ABSTRACT


In this thesis the synchronization requirements of a 
fault-tolerant multiprocessor are defined and methods of main- 
tenance of synchronism are developed. It is demonstrated that 
a synchronous fault-tolerant multiprocessor driven by a fault- 
tolerant clock is more efficient and more easily implemented 
than is an asynchronous fault-tolerant multiprocessor.

Fault-tolerant clocking has been examined intensively here. From
fault-tolerance requirements and the established multiprocessor
synchronization requirements, general specifications are developed
for a fault-tolerant clock. Two general methods of design have been
explored, and it has been concluded that if the clock is to be
distributed to many modules, fault-tolerant clocking through the
concepts advanced by William Daly and John McKenna of the C.S. Draper
Laboratory is more practical to implement than is fault-tolerant
clocking through failure-detection and subsequent clock substitution.
Clocks developed by Daly and McKenna have been examined, refined, and
revised.

It is demonstrated that it is desirable to have available a
fault-tolerant clock which runs at 20 MHz, but that such a
frequency is not achievable by a McKenna-type clock (with use
of current technology). A method of using a relatively slow
McKenna-type clock in conjunction with a frequency multiplier
is developed. Also, analog phase-locking techniques are shown
to be unsuitable for the design of a fault-tolerant clock.

Thesis Supervisor: Albert L. Hopkins, Jr.

Title: Associate Professor of Aeronautics and Astronautics
CHAPTER 1


FAULT-TOLERANT MULTIPROCESSING 


1.1 Introduction 

The concept of a fault-tolerant multiprocessor was
developed and explored at the C.S. Draper Laboratory as a
method of satisfying future spacecraft guidance requirements.
Future space vehicles will require the handling of additional
control loops, and as missions become more complex and/or
lengthier, greater reliability is required. In a proposal to
NASA particular emphasis was placed on the application of a
fault-tolerant multiprocessor as a space shuttle guidance
computer.

1.2 General Characteristics 

The essential elements of a multiprocessor are two or 
more processors capable of simultaneously executing different 
programs (or the same programs) and a common memory accessible 
by all processors. This collection of units has a single path 
for input-output communication. A conceptual diagram is shown 
in Fig. 1.1. 

Because of the parallel operation of the individual
processors there is a significant increase in computational
capability. This parallelism lends itself very well to the
requirement of simultaneous control of many loops. The germ
of increased reliability is also inherent within the
multiprocessing concept: processors are alike and hence any
processor is capable of performing within any control loop at a
given time; it is this modularity which is the fundamental
idea behind fault-tolerant multiprocessing.

1.2.1 Fault-Tolerance 

In general, fault-tolerance is achieved through coded
redundancy, replicated redundancy, or a combination of the two.
Except in special instances, redundancy by replication is more
reliable and more simply implemented (Ref. 1). Within the
multiprocessor under study, fault-tolerance is achieved through
comparison and/or voting amongst replicated units. Some basic
assumptions in the design of this system are: (1) failures
are independent of one another; (2) the same error will not be
made at the same time by two elements which are in a comparison
or voting scheme; and (3) multiple errors will not occur
so as to outwit the fault-tolerant scheme (roughly equivalent
to saying that errors will be separated by some minimum time).
In a system which operates such that failures are independent
of one another, the probability of assumptions nos. 2 and 3
being violated is extremely small.

1.3 Particular Configuration

Figure 1.2 is a representation of the data management
system recommended for the Space Shuttle. The system is
designed to meet a fail operational, fail operational, fail
safe (FOFOFS) specification; by this specification it is
meant that the system will maintain its performance
capabilities after the occurrence of any two failures and, as a
result of a third failure, in the worst case, suffer a graceful
degradation to a configuration which can still assure safe
control of the vehicle. The system is hierarchical. The many
sensors and effectors compose the lowest level. Next up in the
hierarchy, the local processors transform between the
serial-multiplex format of the data bus and the sensor-effector
formats; also they are charged with the function of assuming
those burdens which would unnecessarily overload the top level
of the system. At the top, the regional computer provides data
processing services to the entire system and manages
interactions between subsystems.

The use of multiprocessing techniques in the regional 
computer (RC) serves both to achieve fault-tolerance and to 
yield a larger throughput than would be possible with a simplex 
machine. Within the local processor (LP) duplication of 
processors is used solely as a tool for the achievement of 
fault-tolerance.

1.3.1 Achievement of Fault-Tolerance Within
The Regional Computer

The regional computer multiprocessor configuration
recommended by the Draper Lab is shown in Fig. 1.3. Each
Central Processing Unit (CPU) consists of two processors and a
triplicated scratchpad memory which stores local temporary data
and performs input/output (I/O) buffering. The memory, memory
bus, and data bus are each redundant. The memory may be
accessed by only one CPU at a time, and only one CPU (or LP)
may be transmitting on the data bus at any time.


Fig. 1.3 Regional Computer Multiprocessor (memory bus and I/O bus
each quadruply redundant)






CPU error detection is achieved by comparing the outputs
of the two processors, which run identical programs. The
detection of a CPU error triggers Single Instruction Restart
(SIR), which consists of the moving of the scratchpad contents
of the "failed" unit into memory and the subsequent loading of
this information into the next available "healthy" CPU,
whereupon the failed job is resumed. The recovery is transparent
to the software.

Fault-tolerance requirements are met by providing
sufficient CPUs such that after an established number have 
failed the remainder can provide the necessary response speed 
and throughput to meet the system requirements. Thus, there 
is the advantage of extra processing capability before any CPU 
fails. 

1.3.2 Achievement of Fault-Tolerance Within
The Local Processor

Depending upon system requirements a local processor may
be simplex, duplex, or triplex. In any case error detection is
achieved as in a CPU: each LP unit contains two processors
which perform identical operations and compare outputs.
Fault-tolerance is not achieved through a restart mechanism,
however. In the case of a duplex or triplex LP, local
fault-tolerance is achieved by keeping two LP units in
synchronism. One feeds the data bus and the other has its output
blocked; in the event of a failure of an LP unit, its output is
blocked and the other's is enabled. It is because of the
differences amongst local processors serving different
sensor-effector systems as well as a desire to limit data bus
use that this mechanism for achieving fault-tolerance is used
rather than an SIR.





CHAPTER 2 


SYNCHRONIZATION 


2.1 Definitions and Requirements

If fault-tolerance is to be achieved through comparison
and/or voting amongst replicated units then there are two
requirements of operation which establish a need for
synchronization. First, in order for the comparison to be
effective, corresponding output information from each unit must
be compared. Second, in order to assure equality of internal
operations, accessed input information must be equal at
corresponding program points. It should be noted that although
synchronization is required, simultaneous production of
corresponding information or simultaneous performance of
corresponding internal operations is not required. It is best
to discard the notion that two events can be made to occur at
the same time; in a real system there must always be finite
tolerances in the "simultaneous" initiation of events. It is
fortunate that the synchronization requirements do not call
for simultaneity, but, as shall be seen in Section 2.2, the
impossibility of assuring simultaneity gives rise to
difficulties in assuring synchronization.

2.2 Loss of Synchronization 

Assuming that several modules have been synchronized,
loss of synchronization may be caused by a slivering of pulses.
A sliver can occur when two independent observers strobe a
common signal while it is undergoing a transition. Because of
gate thresholds and propagation delays, the observers may see
different values of the signal. Slivering can cause differences
in the sequences of internal operations (leading to
uncorrelated outputs) or in the outputs of replicated
comparators.

If the input or the event being strobed is phase-locked 
(i.e., dependent on the same clock) with the strobe, then 
slivering may be avoided (barring the event of a failure) 
through good design techniques. If, however, the observed 
signal is produced asynchronously, or more generally, if the 
production of the observed signal is uncorrelated with the 
strobe or the call for data, then anti-sliver circuits must be 
utilized as necessary. 

Current designs for anti-sliver circuits require the 
presence of a two-phase clock. A two-phase clock may be simply 
produced as illustrated in Fig. 2.1. The pulses of the two 
phases are mutually exclusive. The application of this clock
to an anti-sliver circuit is illustrated in Fig. 2.2. The 
event pulse is stored in the first buffer; as illustrated, 
either the concurrent (as in Case 1) or the next (as in Case 2) 
phase A pulse will cause the event to be stored in the second 
buffer, which is strobed, after settling, by a phase B pulse, 
thereby feeding a healthy signal to both units. 
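
The two cases above can be traced with a small model. The following
Python sketch is illustrative only and is not taken from Fig. 2.2:
the function name, the tick-based timing, and the `marginal_capture`
flag (which stands for whether a Phase-A pulse that is concurrent
with the event manages to load the second buffer) are assumptions.
It returns the index of the Phase-B pulse on which both units strobe
the settled second buffer.

```python
def anti_sliver_delivery(event_time, period=10, pulse_width=2,
                         marginal_capture=True):
    """Index k of the Phase-B pulse on which both units see the event.

    Assumed timing (hypothetical): Phase-A pulse k occupies ticks
    [k*period, k*period + pulse_width); Phase-B pulse k falls later in
    the same period, after the second buffer has settled.
    """
    k = 0
    while True:
        a_start = k * period
        a_end = a_start + pulse_width
        if event_time <= a_start:
            # Case 2: the first buffer was already set when this
            # Phase-A pulse began, so the second buffer loads cleanly.
            return k
        if event_time < a_end and marginal_capture:
            # Case 1: the event is concurrent with this Phase-A pulse
            # and the pulse still manages to load the second buffer.
            return k
        # Otherwise the event is caught by the next Phase-A pulse.
        k += 1


# An event between pulses is delivered on the next period's Phase B.
print(anti_sliver_delivery(5))
# A concurrent event is delivered in this period or the next one.
print(anti_sliver_delivery(1, marginal_capture=True),
      anti_sliver_delivery(1, marginal_capture=False))
```

Whichever way the marginal case resolves, both units strobe the same
settled level, so they cannot disagree about the period in which the
event arrived.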

In a fault-tolerant system, when a replicated group of
modules is receiving replicated information asynchronously, an
anti-sliver unanimity circuit may be used to maintain
synchronization. Such a circuit is illustrated in Fig. 2.3. The
information transmitted from the $A_i$s to the $B_i$s is
sliver-free, and all the $B_i$s receive synchronized information
from all the $A_i$s at the same time, so that an accurate vote may
be taken and so that the $B_i$s remain in synchronism. Each
unanimity circuit waits long enough to accumulate all the $A_i$s
but not so long as to impair operation due to a failure of one of
the $A_i$s.

Fig. 2.3 Anti-Sliver Unanimity Circuit, with

$$\mathrm{Maj}(A_i) = A_1 A_2 \lor A_1 A_3 \lor A_2 A_3$$

$$\mathrm{Com}(A_i, M) = A_1 A_2 A_3 M$$

The timing chart in Fig. 2.4 illustrates a possible
sequence of events for the case of no failure. The event to be
transmitted to the $B_i$s is $A_i \to 1$. The example shown in
Fig. 2.4 indicates that the events $A_i \to 1$ occur at a
different time in each of the three $A_i$s. Each event,
$A_i \to 1$, is held in the corresponding flip-flop, FFA$_i$.
Concurrent with the first Phase-A pulse shown, only the two
flip-flops FFA$_i$ holding the first two events are at
logic-level-1; the corresponding two flip-flops $A_i$ are set
during the first Phase-A pulse. The setting of the remaining
flip-flop $A_i$ occurs during the second Phase-A pulse. As can
be seen in Fig. 2.3, each flip-flop $A_i$ is fed to each of the
majority and comparator elements; the output of Maj($A_i$) is
logic-1 when a majority of the inputs is at logic-1, and the
output of Comp($A_i$, M) is logic-1 only when all the inputs are
at logic-1. In this example, Maj($A_i$) $\to$ 1 shortly after
the first Phase-A pulse, while Comp($A_i$, M) does not go to
logic-1 until the occurrence of the second Phase-A pulse. The
occurrence of Maj($A_i$) $\to$ 1 causes the counter of Phase-B
pulses to be reset to zero, causes the flip-flop FFM$_i$ to be
set at logic-1, and also feeds the delay element $\Delta$. The
occurrence of Comp($A_i$, M) $\to$ 1 causes FFM$_i$ to be reset
(here, before the counter reaches an all-1's state, preventing
the propagation of an error signal), and causes an output signal
to be propagated to all $B_i$s.
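
The sequence of events described above can be followed in a
phase-by-phase model. The sketch below is a simplified reading of
the description, not a transcription of Fig. 2.3: the function
names, the dictionary it returns, and the treatment of the delay
element as ideal are assumptions. Time is counted in clock periods,
each consisting of a Phase-A pulse followed by a Phase-B pulse, and
the counter reaching its all-1's state while the unanimity flip-flop
is still set is taken to release the output together with an error.

```python
def majority(bits):
    """True when more than half of the inputs are at logic-1."""
    return sum(bits) > len(bits) // 2


def unanimity_circuit(set_periods, counter_bits=2, n_periods=16):
    """Trace one anti-sliver unanimity circuit.

    set_periods[i] is the period whose Phase-A pulse sets flip-flop
    A_i, or None if that source has failed to produce its pulse.
    Returns where the output is released and whether an error is
    flagged, or None if no majority forms within n_periods.
    """
    all_ones = 2 ** counter_bits - 1
    ffm = False      # set by Maj -> 1, reset by Comp -> 1
    counter = 0      # counts Phase-B pulses since Maj -> 1
    for k in range(n_periods):
        # Phase A: flip-flops A_i take the values held in FFA_i.
        a = [t is not None and t <= k for t in set_periods]
        if majority(a) and not ffm:
            ffm = True
            counter = 0
        if ffm and all(a):
            # Comp -> 1: unanimous, output released without error.
            return {"output_at": ("A", k), "error": False}
        # Phase B: the counter advances while waiting for unanimity.
        if ffm:
            counter += 1
            if counter == all_ones:
                # Timed out: output released together with an error.
                return {"output_at": ("B", k), "error": True}
    return None


print(unanimity_circuit([0, 0, 1]))      # no failure
print(unanimity_circuit([0, 1, None]))   # one source failed
```

With a 2-bit counter the failure case times out on the third Phase-B
pulse after the majority forms, which bounds how long a healthy
majority can be held up by one silent source.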
Fig. 2.4 Response of an anti-sliver unanimity circuit in the event
of no failure (traces shown: Phase A, Maj($A_i$), Com($A_i$, M),
Phase B, Errors)




Fig. 2.5 illustrates a possible sequence for the case of
a failure of one of the $A_i$s. Here one of the $A_i$s is
described as having failed to produce a bit of information
(pulse). One of the two remaining $A_i$s is shown to precede the
other such that the events are passed on to the majority and
comparator elements separated by one Phase-A period. When the
later of these two flip-flops $A_i$ goes to logic-1, the earlier
has already been set, and the majority elements go to logic-1;
the comparator elements, however, remain at logic-0, as the
flip-flop $A_i$ of the failed unit was not set. Again, as a
result of Maj($A_i$) $\to$ 1, the counter is reset, FFM$_i$ is
set, and the delay element is fed. It should be noted that the
time-delay element, $\Delta$, is used in order to prevent the
false indication of an error when an all-1's state is indicated
by the counter just prior to reset. The delay time required is
dependent on clock frequency and propagation delay between a
reset command and a response (assuming the counter was in an
all-1's state) at the input of the error-indicating AND gate.
In the example shown in Fig. 2.5 it is assumed that a 2-bit
counter is used; hence, shortly after the occurrence of the
third Phase-B pulse succeeding the transition Maj($A_i$) $\to$ 1,
each line indicates both an output signal and an error.

2.3 A Synchronized System With Unsynchronized 
Elements 

Before considering a multi-layered hierarchical system,
it is wise to look at a simple model of this problem. Consider
two modules each with one input line and one output line, each
receiving and transmitting data serially. At an arbitrary
time (t = 0), there is no signal on any line and the internal
states of the modules (processors) are equivalent. For all
time, t > 0, equivalent input data is received in serial bytes
by both modules (not simultaneously); it is desired that the
modules perform identical operations with identical internal
and input data, and that corresponding bits of outputted data
be recognized in order to facilitate comparison. In this
analysis it is not necessary to consider whether the processors
are clock-driven (synchronous) or asynchronous machines; in
either case, corresponding produced data bits are separated in
time.

Not only may input data be received at different times
by the modules, but more significantly it may be received at
different points in the program being run by the two modules.
In the worst case such a condition may cause calculations to
be made with different numbers, or a branch to occur in one
processor but not in the other.

Two theorems, which taken together contend that two 
independent processors (or, in general, modules) may be 
synchronized, are stated and proved below. 

Theorem 1. Two independent modules can be made to
perform identical operations with identical internal and input
data.

Proof. Assume there is an interfacing unit associated
with each module's input. The interfacing units may communicate
with each other as well as with their associated modules (see
Fig. 2.6). As in the case of the modules, the structure
and operation of the interfaces are identical with each other.
Assume that incoming bytes of information are buffered in
corresponding registers within each interface. Let each
register have two extra bits (n extra bits in the case of
n parallel modules) above the number used to store input data;
these bits are used as check bits: a 1 in, say, the left bit
will be taken to mean that the input byte to "me" has been
stored in this register, and a 1 in the other bit will mean
that the corresponding input byte to the other module has been
stored in the corresponding register. Hence each module may be
aware if corresponding information is available to both. The
precise nature of the structure, possible microprogram, or
requirements for fault-tolerance of the interface units need
not be considered presently. Rather, the purpose of this
discussion is to determine, first, if a system meeting the
requirements (stated earlier in this section) is possible.
Through use of these check bits by the program the modules may
be kept in synchronization. Note that it is important that
updating of memory associated with a module by input data be
controlled by the program so as to assure equality of available
information at corresponding program points. Anti-slivering
circuits will not be needed between interface and module as
the structure of the programming will exclude slivering
difficulties.
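
The check-bit scheme in the proof of Theorem 1 can be sketched for
the two-module case as follows. This is a minimal illustration
only: the class name `InputInterface`, the helper `link`, the
register depth, and the clearing of the check bits on `take` are
assumptions, since the text deliberately leaves the structure of
the interface units open.

```python
class InputInterface:
    """Input interface for one of two replicated modules (sketch)."""

    def __init__(self, depth=4):
        self.data = [None] * depth
        self.mine = [False] * depth    # "my" input byte is stored
        self.other = [False] * depth   # peer's corresponding byte is stored
        self.peer = None

    def receive(self, slot, byte):
        """Store an incoming byte and tell the peer interface about it."""
        self.data[slot] = byte
        self.mine[slot] = True
        self.peer.other[slot] = True

    def ready(self, slot):
        """True only when both modules hold the corresponding byte."""
        return self.mine[slot] and self.other[slot]

    def take(self, slot):
        """Hand the byte to the program; the program must wait until ready."""
        if not self.ready(slot):
            return None
        byte = self.data[slot]
        self.data[slot] = None
        self.mine[slot] = False
        self.other[slot] = False
        return byte


def link(a, b):
    """Connect two interfaces so they can exchange check bits."""
    a.peer, b.peer = b, a


a, b = InputInterface(), InputInterface()
link(a, b)
a.receive(0, 0x5A)
print(a.ready(0), b.ready(0))   # neither may proceed yet
b.receive(0, 0x5A)
print(a.take(0) == b.take(0))   # both now read the same byte
```

The program in each module is assumed to stall at the corresponding
program point until `ready` is true, rather than branch on it, so
that both modules consume the byte at the same program point.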

Theorem 2. Corresponding bits of output data produced
by two independent modules can be recognized.

Proof. For purposes of buffering of information and
comparison allow an interface unit to be associated with each
module's output. To allow comparison these interfaces have
communication with each other. See Fig. 2.7. At an arbitrary
t = 0, the registers of the output buffers are clear and no
output bytes of information are stored in the registers of the
interface. When corresponding registers have been written
into, the contents are compared by comparators within both
interface units. If there is no error the data is released to
output and the registers read are cleared.

Fig. 2.7 Synchronization of Asynchronous Modules (outputs labeled
A out and B out)

In order to determine how much buffering space is
required, consideration must be given to such things as: peak
rate of output production; maximum time separation in
production of corresponding information; and rate of comparison
and clearing of registers within the interface units. If we
define the following quantities:

    $p$:   peak rate of production (in bits/sec),
    $n$:   length of one register (in bits),
    $t_c$: time required for comparison and subsequent
           outputting and clearing of register (in sec),
    $t_s$: maximum time separation in production of
           corresponding bits (in sec), and
    $x$:   the number of registers required for one
           output interfacing buffer,


then if

$$t_c \le \frac{n\ \text{bits}}{p\ \text{bits/sec}},$$

$$x = \frac{1}{n}\,(p\,t_s + 2n) = \frac{p\,t_s}{n} + 2.$$


If the nature of the processor is such that it produces
information in serial bytes, then the peak rate of production of
bits, averaged over several bytes, will be less than $p$. If
we define:

    $a$:   peak time-averaged rate of production
           (in bits/sec), and
    $t_a$: averaging time (in sec),
then if

$$\frac{n\ \text{bits}}{p\ \text{bits/sec}} < t_c < \frac{n\ \text{bits}}{a\ \text{bits/sec}},$$

$$x = \frac{a\,t_a}{n} - \frac{t_a}{t_c}\cdot\frac{a}{p} + \frac{p\,t_s}{n} + 2.$$

In the worst case, $t_c = \dfrac{n\ \text{bits}}{a\ \text{bits/sec}}$, then:

$$x = \frac{a\,t_a}{n}\left(1 - \frac{a}{p}\right) + \frac{p\,t_s}{n} + 2.$$


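A numerical example would be helpful; consider:

    $p = 10^7$ bits/sec,
    $n = 16$ bits,
    $t_s = 10^{-3}$ sec,
    $a = 10^6$ bits/sec,
    $t_a = 1$ sec;

then, if $t_c \le \frac{n}{p}$:

$$x = \frac{p\,t_s}{n} + 2 = 627 \text{ registers};$$

but if $\frac{n}{p} < t_c < \frac{n}{a}$ (taking the worst case,
$t_c = \frac{n}{a}$):

$$x = \frac{a\,t_a}{n}\left(1 - \frac{a}{p}\right) + \frac{p\,t_s}{n} + 2 = 56{,}877.$$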
n p n 

It can be seen that if the time required to ready a full
register for the next load, $t_c$, is greater than the time it
takes the processor to fill a register, $\frac{n}{p}$, then the
buffering requirements are large. If $t_c$ is larger than
$\frac{n}{a}$, then the buffer must have an infinite number of
registers in order to assure successful operation of the
interfacing system.
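
The buffer-sizing expressions can be checked numerically. The
following is a minimal sketch and not part of the original
analysis; the function name `registers_required`, the rounding up
to a whole register, and the use of exact rational arithmetic
(to avoid floating-point rounding at the boundaries) are
assumptions.

```python
from fractions import Fraction
import math


def registers_required(p, n, t_s, t_c, a=None, t_a=None):
    """Registers needed in one output interfacing buffer.

    p, a in bits/sec; n in bits; t_s, t_c, t_a in seconds. Pass ints
    or Fractions so that the comparisons with n/p and n/a are exact.
    """
    p, n, t_s, t_c = (Fraction(v) for v in (p, n, t_s, t_c))
    if t_c <= n / p:
        # Registers are readied at least as fast as they are filled.
        return math.ceil(p * t_s / n + 2)
    if a is None or t_a is None:
        raise ValueError("a and t_a are needed when t_c exceeds n/p")
    a, t_a = Fraction(a), Fraction(t_a)
    if t_c > n / a:
        # Registers fill faster, on average, than they can be cleared.
        return math.inf
    x = a * t_a / n - (t_a / t_c) * (a / p) + p * t_s / n + 2
    return math.ceil(x)


p, n, t_s = 10**7, 16, Fraction(1, 1000)
a, t_a = 10**6, 1
print(registers_required(p, n, t_s, t_c=Fraction(n, p)))            # 627
print(registers_required(p, n, t_s, Fraction(n, a), a=a, t_a=t_a))  # 56877
```

The two printed values reproduce the 627 and 56,877 registers of the
numerical example, the second being the worst case $t_c = n/a$.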

The elements of design of this elementary system may be 
applied to a multi-layered hierarchical system. The primary 
drawback is in the requirements which must be imposed on the 
software of each processor in order to maintain synchronism
of operations amongst replicated units. Different software 
"tricks" will be required for different input information 
usage; any requirement imposed on software for purposes of 
maintaining synchronism will serve to decrease processor 
speed and hence overall system speed. It is generally poor 
design procedure to depend on software improvisation for 
system operation. 

2.4 System Synchronization Through Use Of A 
Common Clock 

Consideration is now given to a system in which all
units are synchronous machines and one clock is used for the
driving of all units. Replicated units which are driven by
the same clock may be defined to be in synchronism. In the case
of units which, for purposes of fault tolerance, run the same
program and receive the same data (the input being controlled
by the same clock), the initiation of each corresponding
microprogram step as well as the receipt of corresponding bits
of information occur concurrently (plus or minus some small
tolerance); such units are said to be in tight synchronism.

First consider the same elementary problem explored in
Section 2.3: the synchronization of two processing units.
Even though corresponding input information and corresponding
program steps are synchronized by the same clock, slivering
may allow one unit to recognize an input a microstep before
the other. However, to maintain tight synchronism anti-sliver
circuits are not necessary; two-phase clocking is sufficient
to avoid slivering: one phase (A) is used for receiving and
transmitting of information and the other phase (B) for
driving the processor. Output bits produced by phase A are
buffered and then compared and transmitted by phase B.

In a multi-level hierarchical system, such as the 
C.S. Draper Laboratory Space Shuttle Guidance Computer pro- 
posal, this method of synchronization should be adequate for 
the entire system. However, information transmitted from 
sensor to local processor is not likely to be synchronized 
with the system clock; for such an interface anti-sliver 
circuits (or anti-sliver unanimity circuits where called for 
by fault tolerance requirements) can provide the necessary 
synchronization of receipt of information by local processors. 

2.5 Conclusions 

At first glance the system described in Section 2.4 is
quite simple and desirable, especially in light of the
alternative (Section 2.3). The difficulty in the design of a
system synchronized through use of a common clock lies in the
design of the clock. Such a clock must meet the fault-tolerant
specifications both in its internal structure and in its
distribution around the system; this is no easy task.
Nevertheless it is felt that it is much more desirable to add to
the complexity of hardware design by calling for a
fault-tolerant clock than it is to suffer the pains of
dependency on software improvisation required in an
unsynchronized system. It should also be noted that although a
synchronous processor is generally slower than an asynchronous
processor, an asynchronous fault-tolerant multiprocessor, due to
increased software requirements and necessary stop and wait
periods, would probably be slower than a fault-tolerant
multiprocessor driven by a fault-tolerant clock.
CHAPTER 3


CLOCKING 


3.1 Specifications of a Fault-Tolerant Clock 

As stated in Section 2.5, if a common clock is to be
used to drive the system, it must meet the fault-tolerance
specifications of the overall system. Whether the system
specification be fail operational or fail safe, the clock
specification must be fail operational; the clock is as
fundamental to the system as the power supply. In the case of a
fault-tolerant clock designed to drive a Space Shuttle guidance
computer, the clock would need to be able to perform after the
occurrence of any combination of three independent failures.

Of prime importance in the design is that the synchro- 
nized state of the system is affected neither by any mode of 
failure of the clock nor by the method of recovery from the 
failed state (i.e., the synchronization of the system must be 
transparent to clock failures).

For purposes of design and discussion, the distribution
of the clock to all parts of the system will be considered as
a part of the clock design; this seems logical, as different
concepts of fault-tolerant clocking may conceivably warrant
different methods of distribution. It should be realized,
however, that one of the keys to a good design will be
minimization of the number of wires required for distribution.
In a system such as that required for the Space Shuttle,
distances between modules may be on the order of 100 feet; in
such a geographically distributed system wiring may assume a
large share of the cost and complexity.

As a fault-tolerant computing system may have on the 
order of hundreds of modules, it is desired to minimize the 
logic required within each to convert the distributed clock 
information into the one train of clock pulses which is used 
for driving the module. Any failure of this logic will be 
considered as a failure of the entire module, which will be 
detected by comparison of outputs amongst replicated units. 

3.2 General Methods of Design 

Two general design approaches come to mind: (1) use a
single clock in conjunction with a single-wire bus for
distribution until the occurrence of a failure in the oscillator
or in the distribution, at which time another clock and its
associated bus are brought into action; this principle is
illustrated in Fig. 3.1; the enable circuits permit only one
clock to be distributed at a time; (Enable)$_n$ passes clock $n$
if and only if failure detectors 1 through $n-1$ indicate failure
(initially clock 1 is distributed); (2) use a group of mutually
synchronized oscillators which can tolerate the required number
of failures and still have several "good" outputs; see Fig. 3.2.
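
The enable rule of approach (1) can be written out as a short
sketch. The function name `enable_signals` is hypothetical, and one
condition is added here as an assumption that the text does not
state explicitly: a clock is passed only while its own failure
detector has not tripped, which is what keeps at most one clock on
distribution at a time.

```python
def enable_signals(failed):
    """Enable outputs for clocks 1..n given latched detector states.

    failed[i] is True when failure detector i+1 indicates failure.
    (Enable)_n passes clock n only if detectors 1..n-1 all indicate
    failure; the extra "and not f" term is an assumption of this
    sketch, not a statement from the text.
    """
    enables = []
    predecessors_failed = True   # vacuously true for clock 1
    for f in failed:
        enables.append(predecessors_failed and not f)
        predecessors_failed = predecessors_failed and f
    return enables


print(enable_signals([False, False, False]))  # clock 1 distributed
print(enable_signals([True, False, False]))   # clock 2 takes over
print(enable_signals([True, True, False]))    # clock 3 takes over
```

A failure of a standby clock that is not yet in use leaves the
enables unchanged in this model; the standby is simply skipped when
its turn comes.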

3.3 Fault-Tolerant Clocking Through Failure-Detection 
and Subsequent Clock Substitution 

Fig. 3.1 Fault-Tolerant Clocking through Failure-Detection and
Clock Substitution ($E_i$: (Enable)$_i$; $FD_i$: (Failure Detector)$_i$)

Fig. 3.2 Fault-Tolerant Clocking through Synchronization of
Oscillators

In this section, through logical development, an exploration
is made of the feasibility of a clocking system which
achieves fault-tolerance through failure-detection and
subsequent clock substitution. Consider Fig. 3.3; the clock
system and the clock bus are to be designed to be
fault-tolerant; neither the connection between module and bus,
nor the module transducer need be fault-tolerant, as a failure
there may be considered to be a failure of the associated
module. The design of the module transducer and its connections
to the bus, however, is an integral part of the design of the
clocking system; the module transducer converts the information
on the bus into a single continuous clock waveform and needs to
be designed such that the outputs of all module transducers are
in synchronism. It will be seen that some elements of the
clocking system need to be external to the module, while others
need to be associated with the module.

In order to simplify the feasibility study, system design
for single-fault-tolerance will be explored first. Figure 3.4
is a general description of a single-fault-tolerant clock. In
order to assure that each module utilizes the same clock,
failure detection should be external to the module.

The most obvious difficulty in designing the
failure-detection and reconfiguration scheme is maintenance of
synchronization through the failure and reconfiguration process.
In order to prevent the failed clock from feeding the data
management system, the clock waveform must be tested for failure
before it is used; but in order to detect failures in
distribution, the clock waveform must be tested after
distribution. It would appear that each module transducer must
be designed to "hold" (delay) use of the clock waveform until it
is sure that a failure has not occurred. When a failure is
detected each transducer holds its output at, say,
logic-level-0, until after the clock system has been
reconfigured and a "good" clock waveform is available, at which
time each transducer passes the good clock waveform. The
difficulty now is in assuring that synchronization is maintained
through the local transducer process of output-hold-output.

Fig. 3.3 General Fault-Tolerant Clocking Concept

Fig. 3.4 Single-Fault-Tolerant Concept (E: Enable; FD: Failure
Detector)

If one of the enable circuits fails such as to produce a
random output, the module transducer is required to choose the
"good" waveform, if single-fault-tolerance is to exist. The
amount of circuitry required for the transducer to choose the
"good" waveform can be reduced if the state of the failure
detector is made available to the transducers (via a failure
detector bus); if this is done, the enable circuits shown in
Fig. 3.4 become superfluous. Figure 3.4 may be revised as
illustrated in Fig. 3.5. The failure detector may be simply
implemented as illustrated in Fig. 3.6. The circuit is
designed to allow a tolerance on clock pulse width and
separation between pulses; if the tolerance is violated the
output of the failure-detector goes to, and is held at, logic-1;
the reset capability is provided for initializing the clocking
system. The pulse widths of the one-shot outputs determine the
tolerance; it is a straightforward procedure to determine the
necessary one-shot timing, given: clock frequency, duty-cycle,
and allowed variations in both, as well as data concerning
tolerances of the propagation delays and one-shot pulse widths
associated with the failure-detector components. The
retriggerable one-shot should have an output pulse width of
approximately twice the period of the clock; it assures the
detection of a failure to logic-level-0 or logic-level-1.
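
The behaviour required of the failure detector can be modelled in
software as a check on a recorded pulse train. The sketch below is
a behavioural model only and does not reproduce the circuit of
Fig. 3.6: the function name, the representation of the clock as
(rise, fall) times in integer ticks, the single symmetric tolerance
`tol`, and the reporting of a violation at the edge where it
becomes evident are all assumptions.

```python
def first_failure_time(pulses, period, high_time, tol, end_time):
    """Time at which the detector output latches to logic-1, or None.

    pulses: list of (rise, fall) times in ticks, in time order.
    The detector is taken to be reset at time 0.
    """
    low_time = period - high_time
    timeout = 2 * period        # retriggerable one-shot width
    last_rise = 0               # one-shot (re)triggered at reset
    prev_fall = None
    for rise, fall in pulses:
        if rise - last_rise > timeout:
            # The one-shot expired before being retriggered:
            # the line was stuck at logic-0 or logic-1.
            return last_rise + timeout
        if prev_fall is not None and abs((rise - prev_fall) - low_time) > tol:
            return rise         # separation between pulses out of tolerance
        if abs((fall - rise) - high_time) > tol:
            return fall         # pulse width out of tolerance
        last_rise, prev_fall = rise, fall
    if end_time - last_rise > timeout:
        return last_rise + timeout
    return None


good = [(k * 100, k * 100 + 50) for k in range(5)]
print(first_failure_time(good, 100, 50, 10, end_time=500))       # None
print(first_failure_time(good[:3], 100, 50, 10, end_time=1000))  # 400
```

In the second call the clock stops after its third pulse, and the
modelled one-shot, last retriggered at tick 200, expires two
periods later.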

Fig. 3.5 Revised Single-Fault-Tolerant Concept

Fig. 3.6 Failure Detector

Fig. 3.7 Module Transducer for Single-Fault-Tolerant Clocking
through Failure-Detection and Subsequent Clock Substitution

Figure 3.7 is the design of a module transducer which
may be used in conjunction with the clock system of Figures 3.5
and 3.6. $\Delta_1$ is the delay associated with each module
transducer; it allows the prevention of the propagation of a
failed clock. In a system state prior to clock failure, the
waveform of clock 1 is passed through delay, $\Delta_1$, and
one-shot, o.s.$_1$, and then to the output. If a failure is
indicated, FF$_1$ $\to$ 1, causing the termination of the
distribution of clock 1 and the subsequent distribution of
clock 2 to each module. Additional circuitry is provided in each
transducer for maintenance of synchronization during and after
the switching period. Slivering within the transducer in
effecting the cut-off of clock 1 and the cut-in of clock 2 may,
in some cases, yield an extra pulse associated with clock 1 or
an extra pulse associated with clock 2; thus after the switching
has taken place the total number of clock pulses supplied to
each module may differ by one or two. If an "extra" pulse of
clock 1 is propagated, FF$_2$ is set, and if an "extra" pulse of
clock 2 is propagated, FF$_3$ is set. In those transducers in
which FF$_2$ or FF$_3$ was not set during the switching process,
one extra pulse for each unset flip-flop is inserted between the
end of the clock 1 waveform and the beginning of the clock 2
waveform, thereby maintaining synchronism. Requirements are
established for values of $\Delta_1$, $\Delta_2$, $\Delta_3$,
$\Delta_4$, and the pulse widths of the one-shots, as well as
their tolerances; the requirements are imposed by the system
parameters (e.g., clock frequency), as well as by the required
method of operation.

It is believed that the module transducer shown in Fig.
3.7 is an example of a minimally complex (or nearly so)
transducer required for fault-tolerant clocking through failure
detection and subsequent clock substitution. The necessity of
requiring such complex operations to be performed on the module
level, rather than on the external clock system level, has been
justified in the development of this section. Because of the
module-level complexity here, it is felt that designs such as
those described in later sections of this thesis are much more
desirable; hence a detailed analysis of the module transducer
illustrated in Fig. 3.7 is not presented.

For n-fault-tolerance, the system becomes more compli- 
cated on all levels, and, of course, it is the increased com- 
plication on the module level which is particularly undesirable. 
It is concluded that fault-tolerant clocking through failure 
detection and subsequent clock substitution is possible, but 
is extremely costly in hardware implementation. 


3.4 The McKenna Clock 


3.4.1 First Concept 


In August, 1971, William Daly and John F. McKenna, in a
C.S. Draper Laboratory memo (Ref. 2), described their design of
a fault-tolerant clock. Figure 3.8 illustrates the concept
proposed for single-fault-tolerance. It is seen that this
design conforms to the method of clocking shown in Fig. 3.2 and
indeed may be described as a synchronization of oscillators.
It should be noted, however, that here a single clock element,
apart from the others, is not an oscillator. Rather, as shall
be seen in the analysis to follow, each clock element depends
on the occurrence of the transition in state of several of the
clocks in order to be driven to change its own state.

The quorum function $Q_a^n$ is defined to be 1 if at least $a$
of the $n$ independent variables $C_1, C_2, \ldots, C_n$ are 1,
and 0 otherwise. For example:

$$Q_1^4 = C_1 + C_2 + C_3 + C_4$$

Fig. 3.8 (diagram not reproduced; labels recovered from the
figure: $C_1$, $C_2$, $C_3$, $C_4$, Clock)

$Q_2^4$ and $Q_3^4$ may each be realized through two levels of
Boolean gating or through one level of threshold logic. The use
of threshold logic, however, offers no advantage unless LSI
threshold logic technology is to be used.
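
The quorum function defined above can be sketched in a few lines of
Python. This is only an illustration of the definition; the function
name `quorum` and the tuple representation of the clock lines are
assumptions made here, not part of the original design.

```python
from itertools import product


def quorum(a, clocks):
    """Return 1 if at least `a` of the clock lines are at logic-1, else 0.

    `clocks` is any sequence of 0/1 values (hypothetical representation).
    """
    return 1 if sum(clocks) >= a else 0


# With a = 1 and n = 4 the quorum function reduces to a four-input OR,
# which is the example given in the text.
for c in product((0, 1), repeat=4):
    assert quorum(1, c) == (c[0] | c[1] | c[2] | c[3])

# The two functions used by each clock element: "at least two of four"
# and "at least three of four".
lines = (1, 0, 1, 0)
print(quorum(2, lines), quorum(3, lines))
```

Evaluating both thresholds on the same four lines shows how the two
quorum outputs can disagree while the clock edges are in transit.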

All clock lines are distributed to each synchronous
module within the system. It will be demonstrated in a later
part of this section that majority voting and subsequent
filtering within each module is necessary to maintain system
synchronization, and an adequate module transducer will be
described.

The logic required to induce free-running oscillation
after power-on is not shown in Fig. 3.8. For the purposes of
this analysis it will be assumed that the clock elements are
already oscillating in synchronism at the time of observation.
Timing analyses will be made to demonstrate system performance.


For purposes of analysis the following assumptions are
made: each gate has a propagation delay equal to $\Delta$; the
propagation delay in forming a quorum function is $2\Delta$;
$\Delta t$ is a pure delay, greater than or equal to $8\Delta$.
The last assumption arises as follows. In order to avoid
slivering within the clock element, which could yield spikes in
the output, $\Delta t$ must be greater than the propagation
delay through the series of gates $H_{2i}$, $A_{3i}$, $M_{2i}$,
$N_{4i}$, which is $4\Delta$; however, since by most
manufacturers' specifications the propagation delay within a
simple Boolean gate may reach nearly twice the typical delay
time, in the worst case the delay through four gates in series
may approach $8\Delta$.


Assume that the clock elements are oscillating in
synchronism. Assume that at time $t = t_1$, $C_1$ goes to
logic-1; at $t = t_2$, $C_2 \to 1$; at $t = t_3$, $C_3 \to 1$;
and at $t = t_4$, $C_4 \to 1$ (see Fig. 3.9), where
$t_4 > t_3 > t_2 > t_1$. Assume that at an initial time of
observation, $(t_1 - \epsilon)$, all propagation within each
clock element caused by the previous transition, $C_i \to 0$,
has ceased (this will be verified within the analysis to
follow); therefore the outputs of the gates (Ref. Fig. 3.8) are
as follows:


Initially:

$C_i = 0$;  $Q_{2i}^4 = 0$;  $Q_{3i}^4 = 0$
$D_{1i} = 0$;  $D_{2i} = 0$
$H_{1i} = 1$;  $H_{2i} = 1$
$A_{1i} = 1$;  $A_{2i} = 0$;  $M_{1i} = 0$
$A_{3i} = 1$;  $A_{4i} = 0$;  $M_{2i} = 0$
$A_{5i} = 0$;  $A_{6i} = 0$;  $M_{3i} = 1$
$A_{7i} = 0$;  $A_{8i} = 0$;  $M_{4i} = 1$
$N_{1i} = 1$;  $N_{2i} = C_i = 0$;  $N_{3i} = 1$;  $N_{4i} = 1$

Given the initial state and the assumed progression of
events, the timing analysis is as follows; for reference
purposes all transitions are numbered.

Fig. 3.9 Assumed Synchronism

$C_1 \to 1$ at $t_1$ ($N_{21} \to 1$)    (3.1)
$Q_{2i}^4$ and $Q_{3i}^4$ remain at 0
$\therefore$ gates $X_{ji}$, $i = 2, 3, 4$, remain unchanged from the initial state
$N_{31} \to 0$ at $t_1 + \Delta$ (the validity of this statement is dependent on the nature of transition (3.1))    (3.2)

$C_2 \to 1$ at $t_2$ ($N_{22} \to 1$)    (3.3)
$Q_{2i}^4 \to 1$ at $t_2 + 2\Delta$    (3.4)
$Q_{3i}^4$ remains at 0
$N_{32} \to 0$ at $t_2 + \Delta$ (as in (3.2))    (3.5)
$A_{2i} \to 1$ at $t_2 + 3\Delta$    (3.6)
$D_{1i} \to 1$ at $t_2 + 2\Delta + \Delta t$    (3.7)
$H_{1i} \to 0$ at $t_2 + 3\Delta + \Delta t$    (3.8)
$A_{1i} \to 0$ at $t_2 + 4\Delta + \Delta t$    (3.9)

It has already been assumed that $\Delta t \ge 8\Delta$, but it
is still interesting to note that for $\Delta t = 0$, a short
duration pulse might or might not be generated by $M_{1i}$,
yielding possible slivering within the clock's logic and hence
unpredictable operation.


$C_3 \to 1$ at $t_3$ ($N_{23} \to 1$)    (3.10)
$N_{33} \to 0$ at $t_3 + \Delta$ (as in (3.2))    (3.11)
$Q_{2i}^4$ remains at 1
$D_{1i}$ remains at 1
$Q_{3i}^4 \to 1$ at $t_3 + 2\Delta$    (3.12)
$H_{2i} \to 0$ at $t_3 + 3\Delta$    (3.13)
$A_{3i} \to 0$ at $t_3 + 4\Delta$    (3.14)
$M_{2i} \to 1$ at $t_3 + 5\Delta$    (3.15)
$N_{4i} \to 0$ at $t_3 + 6\Delta$    (3.16)
$D_{2i} \to 1$ at $t_3 + 2\Delta + \Delta t$    (3.17)
$A_{8i} \to 1$ at $t_3 + 3\Delta + \Delta t$    (3.18)
$M_{4i} \to 0$ at $t_3 + 4\Delta + \Delta t$    (3.19)
$N_{3i} \to 1$ at $t_3 + 5\Delta + \Delta t$    (3.20)
$(N_{2i} = C_i) \to 0$ at $t_3 + 6\Delta + \Delta t$    (3.21)
$N_{4i} \to 1$ at $t_3 + 5\Delta + \Delta t$    (3.22)
$A_{4i} \to 1$ at $t_3 + 6\Delta + \Delta t$    (3.23)
$M_{2i} \to 0$ at $t_3 + 7\Delta + \Delta t$    (3.24)
$A_{8i} \to 0$ at $t_3 + 8\Delta + \Delta t$    (3.25)
$M_{4i} \to 1$ at $t_3 + 9\Delta + \Delta t$    (3.26)

As a result of (3.21):

$Q_{2i}^4 \to 0$ at $t_3 + 8\Delta + \Delta t$    (3.27)
$Q_{3i}^4 \to 0$ at $t_3 + 8\Delta + \Delta t$    (3.28)
$A_{2i} \to 0$ at $t_3 + 9\Delta + \Delta t$    (3.29)
$M_{1i} \to 1$ at $t_3 + 10\Delta + \Delta t$    (3.30)
$N_{1i} \to 0$ at $t_3 + 11\Delta + \Delta t$    (3.31)
$D_{1i} \to 0$ at $t_3 + 8\Delta + 2\Delta t$    (3.32)
$H_{1i} \to 1$ at $t_3 + 9\Delta + 2\Delta t$    (3.33)
$A_{5i} \to 1$ at $t_3 + 10\Delta + 2\Delta t$    (3.34)
$M_{3i} \to 0$ at $t_3 + 11\Delta + 2\Delta t$    (3.35)
$(N_{2i} = C_i) \to 1$ at $t_3 + 12\Delta + 2\Delta t$    (3.36)
$N_{3i} \to 0$ at $t_3 + 13\Delta + 2\Delta t$    (3.37)
$N_{1i} \to 1$ at $t_3 + 12\Delta + 2\Delta t$    (3.38)
$A_{1i} \to 1$ at $t_3 + 13\Delta + 2\Delta t$    (3.39)
$M_{1i} \to 0$ at $t_3 + 14\Delta + 2\Delta t$    (3.40)
$A_{5i} \to 0$ at $t_3 + 15\Delta + 2\Delta t$    (3.41)
$M_{3i} \to 1$ at $t_3 + 16\Delta + 2\Delta t$    (3.42)
$H_{2i} \to 1$ at $t_3 + 9\Delta + \Delta t$    (3.43)
$A_{3i} \to 1$ at $t_3 + 10\Delta + \Delta t$    (3.44)
$D_{2i} \to 0$ at $t_3 + 8\Delta + 2\Delta t$    (3.45)
[gate label illegible in the source] $\to 0$ at $t_3 + 9\Delta + 2\Delta t$    (3.46)
$A_{4i} \to 0$ at $t_3 + 9\Delta + 2\Delta t$    (3.47)


It is seen that the last transition of each gate (up to
transition (3.47)) restores the gate to its initial setting.
Because of transition (3.36), $C_i \to 0$ at a time
$6\Delta + \Delta t$ later; and because of $C_i \to 0$,
$C_i \to 1$ in another increment of time $6\Delta + \Delta t$.
The duty-cycle of $C_i$ is 50%. The period is:

$$T_{C_i} = 12\Delta + 2\Delta t$$

Hence, with $\Delta t$ at its minimum value of $8\Delta$, the
maximum frequency is:

$$f_{max} = \frac{1}{28\Delta}$$

For medium speed TTL, $\Delta \approx 12$ ns; therefore
$f_{max} \approx 3$ MHz.
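
The arithmetic behind the 3 MHz figure can be checked with a short
sketch. The function names are illustrative only; the formulas are the
period $12\Delta + 2\Delta t$ and the minimum delay
$\Delta t = 8\Delta$ stated in the text.

```python
def first_concept_period(delta, delta_t):
    """Period T = 12*Delta + 2*Delta_t of a first-concept clock element."""
    return 12 * delta + 2 * delta_t


def max_frequency(delta):
    """Maximum frequency, taking Delta_t at its minimum value of 8*Delta."""
    return 1.0 / first_concept_period(delta, 8 * delta)


delta = 12e-9  # seconds; medium-speed TTL gate delay quoted in the text
period = first_concept_period(delta, 8 * delta)
print(f"T = {period * 1e9:.0f} ns, f_max = {max_frequency(delta) / 1e6:.2f} MHz")
```

With $\Delta = 12$ ns the period is $28 \times 12 = 336$ ns, so the
frequency is slightly under 3 MHz.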

With use of the above timing analysis, the assertion that
$\Delta t$ must be greater than $4\Delta$ for successful
operation may be demonstrated. Assume that
$\Delta < \Delta t < 4\Delta$; then transition (3.17) occurs
before (3.16) and a possible timing sequence is as follows:


$D_{2i} \to 1$ at $t_3 + 2\Delta + \Delta t$    (3.48)
$N_{4i} \to 0$ at $t_3 + 6\Delta$    (3.49)
$A_{8i} \to 1$ at $t_3 + 3\Delta + \Delta t$    (3.50)
$A_{4i} \to 1$ at $t_3 + 3\Delta + \Delta t$    (3.51)
$A_{4i} \to 0$ at $t_3 + 7\Delta$    (3.52)
$M_{2i} \to 0$ at $t_3 + 4\Delta + \Delta t$    (3.53)
$M_{4i} \to 0$ at $t_3 + 4\Delta + \Delta t$    (3.54)
$M_{2i} \to 1$ at $t_3 + 8\Delta$    (3.55)
$N_{4i} \to 1$ at $t_3 + 5\Delta + \Delta t$    (3.56)
$N_{3i} \to 1$ at $t_3 + 5\Delta + \Delta t$    (3.57)
$A_{8i} \to 0$ at $t_3 + 5\Delta + \Delta t$    (3.58)
$A_{8i} \to 1$ at $t_3 + 9\Delta$    (3.59)
[gate label illegible in the source] $\to 1$ at $t_3 + 6\Delta + \Delta t$    (3.60)
$(N_{2i} = C_i) \to 0$ at $t_3 + 6\Delta + \Delta t$    (3.61)
$M_{4i} \to 1$ at $t_3 + 6\Delta + \Delta t$    (3.62)
$M_{4i} \to 0$ at $t_3 + 10\Delta$    (3.63)

Transitions (3.61) and (3.62) both occur at
$t_3 + 6\Delta + \Delta t$, but if (3.62) occurs just before
(3.61) the following transitions may occur in some clock
elements:

$N_{3i} \to 0$ at $t_3 + 7\Delta + \Delta t$    (3.64)
$(N_{2i} = C_i) \to 1$ at $t_3 + 8\Delta + \Delta t$    (3.65)

So, it is seen that for $\Delta < \Delta t < 4\Delta$, proper
operation cannot be assured.

Now assume that $\Delta t < \Delta$; then as a result of
transition (3.12):

$D_{2i} \to 1$ at $t_3 + 2\Delta + \Delta t$    (3.66)
$H_{2i} \to 0$ at $t_3 + 3\Delta$    (3.67)
$A_{4i} \to 1$ at $t_3 + 3\Delta + \Delta t$    (3.68)
$A_{3i} \to 0$ at $t_3 + 4\Delta$    (3.69)

Since (3.68) occurs before (3.69), $M_{2i}$ will not go to 1 and
therefore $N_{4i}$ will not go to 0, yielding a non-oscillatory
condition.


It has been shown, in support of the original assertion,
that $\Delta t$ must be greater than $4\Delta$. Also, as
mentioned, in order to assure operation in the event that gate
propagation delays are nearly double the nominal value,
$\Delta t$ should be no less than $8\Delta_{nom}$.


The principle of operation is seen to be as follows: when
$Q_3^4 \to 1$, $C_i \to 0$ at a time $6\Delta + \Delta t$ later;
when $Q_2^4 \to 0$, $C_i \to 1$ at a time $6\Delta + \Delta t$
later. Differences amongst clock elements in the propagation of
the signal triggered by the leading edge of the $Q_3^4$ pulse or
in the propagation of the signal triggered by the trailing edge
of the $Q_2^4$ pulse will cause minor time separations in the
occurrences of leading and trailing edges amongst clock element
pulses. It should be noted that if, for clock element $i$,
$Q_{2i}^4 \to 1$ before $H_{1i} \to 1$, then $C_i$ is driven to
logic-1 at a time $5\Delta$ after $Q_2^4 \to 1$. Similarly, if
$Q_{3i}^4 \to 0$ before $D_{2i} \to 1$, then $C_i$ is driven to
logic-0 at a time $6\Delta$ after $Q_3^4 \to 0$.

For purposes of determining the effect of differences in
propagation delays on clock performance consider the following
definitions and analysis. First, assume that the delays of each
clock element are within tolerances such that the
"set-to-agree" function ($Q_2^4 \to 1$ drives $C_i \to 1$, or
$Q_3^4 \to 0$ drives $C_i \to 0$) is not utilized in normal
operation. Now, define the time between the event
$Q_3^4 \to 1$ (here, $Q$ refers to the conceptual quorum
function, not the physical implementation) and the resulting
event $C_i \to 0$ as $(\delta t_d)_i$; define the time between
the event $Q_2^4 \to 0$ and the resulting $C_i \to 1$ as
$(\delta t_u)_i$. Once the event $Q_3^4 \to 1$ has occurred,
$Q_2^4 \to 0$ will occur $(\delta t_d)_x$ later, where $x$ is
the clock possessing the next to the largest $(\delta t_d)$;
after $Q_2^4 \to 0$ occurs, $Q_3^4 \to 1$ will occur
$(\delta t_u)_y$ later, where $y$ is the clock possessing the
next to the largest $(\delta t_u)$.

Two specifications which, considered together, offer a
significant measurement of clock performance may be defined as
follows:

$\Delta T_u \equiv$ duty-cycle of the function $C_1 C_2 C_3 C_4$
$\Delta T_d \equiv$ duty-cycle of the function $\bar{C}_1 \bar{C}_2 \bar{C}_3 \bar{C}_4$

These two specifications indicate, respectively, the size of the
overlap region of the clock pulses and the size of the overlap
region of the clock 0-states. If there were no differences
amongst propagation delays of like elements then $\Delta T_u$
and $\Delta T_d$ would both be $\frac{1}{2}$. Following is a
derivation of the relationships between $\Delta T_u$,
$\Delta T_d$ and the $(\delta t_d)$'s, $(\delta t_u)$'s.

Assign numbers to the clocks such that:

$$(\delta t_d)_1 < (\delta t_d)_2 < (\delta t_d)_3 < (\delta t_d)_4 \qquad (3.70)$$

Assign letters to the clocks such that:

$$(\delta t_u)_a < (\delta t_u)_b < (\delta t_u)_c < (\delta t_u)_d \qquad (3.71)$$

There are 4! distinct sets of the four clock outputs, yet in
each case the period of the clock is given by:

$$T = (\delta t_d)_3 + (\delta t_u)_c \qquad (3.72)$$

and

$$\Delta T_u = \frac{1}{T}\left[(\delta t_d)_1 - (\delta t_u)_d + (\delta t_u)_c\right] \qquad (3.73)$$

$$\Delta T_d = \frac{1}{T}\left[(\delta t_u)_a - (\delta t_d)_4 + (\delta t_d)_3\right] \qquad (3.74)$$

Two cases are illustrated (Figs. 3.10, 3.11) for the purpose of
shedding some light on why $\Delta T_u$ and $\Delta T_d$ are
independent of the manner of the pairings of the
$(\delta t_d)$'s with the $(\delta t_u)$'s in the four clocks.


The percentage variation of the $(\delta t_d)$'s and the
$(\delta t_u)$'s around some nominal value is dependent on both
component specifications and component selection; testing and
subsequent selection of components will yield a minimum
variation. A worst-case analysis will yield a direct correlation
between clock performance, as measured by $\Delta T_u$ and
$\Delta T_d$, and the tolerances of the $(\delta t_d)$'s and
$(\delta t_u)$'s. If $(\delta t_u)$ and $(\delta t_d)$ fall
within a range
$(\delta t)_{nom}(1 - x) < \delta t < (\delta t)_{nom}(1 + x)$,
then from eqns (3.72), (3.73), and (3.74):

$$[\Delta T_u]_{min} = [\Delta T_d]_{min} =
\begin{cases}
\dfrac{1 - 3x}{2}, & x < \dfrac{1}{3} \\
0, & x \ge \dfrac{1}{3}
\end{cases} \qquad (3.75)$$


Fig. 3.10 Determination of $\Delta T_u$ and $\Delta T_d$ - Case 1
($T = (\delta t_d)_3 + (\delta t_u)_c$;
$\Delta T_u = \frac{1}{T}[(\delta t_d)_1 - (\delta t_u)_d + (\delta t_u)_c]$;
$\Delta T_d = \frac{1}{T}[(\delta t_u)_a - (\delta t_d)_4 + (\delta t_d)_3]$)

Fig. 3.11 Determination of $\Delta T_u$ and $\Delta T_d$ - Case 2
($T = (\delta t_d)_3 + (\delta t_u)_c$;
$\Delta T_u = \frac{1}{T}[(\delta t_d)_1 - (\delta t_u)_d + (\delta t_u)_c]$;
$\Delta T_d = \frac{1}{T}[(\delta t_u)_a - (\delta t_d)_4 + (\delta t_d)_3]$)

Note that due to the "set-to-agree" function it is unrealistic
to assume that $[(\delta t_d)_3 - (\delta t_d)_2]$ or
$[(\delta t_d)_4 - (\delta t_d)_2]$ is larger than $6\Delta$, or
that $[(\delta t_u)_c - (\delta t_u)_b]$ or
$[(\delta t_u)_d - (\delta t_u)_b]$ is larger than $5\Delta$,
but that the above analysis is valid because, in the worst case,
$(\delta t_u)_a$ and $(\delta t_d)_1$ may be equal to
$(\delta t)_{nom}(1 - x)$ and the other $(\delta t_d)$'s and
$(\delta t_u)$'s equal to $(\delta t)_{nom}(1 + x)$; in this
worst case the set-to-agree function is not utilized.
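
As a numerical illustration of eqns (3.72) through (3.74), the sketch
below computes the period and the two overlap duty-cycles from four
$(\delta t_d)$ values and four $(\delta t_u)$ values. The function
names, the clamping of negative overlaps to zero, and the sample
numbers are assumptions introduced here. Because only the sorted
values enter the formulas, the result does not depend on how the
$(\delta t_d)$'s are paired with the $(\delta t_u)$'s.

```python
def overlap_duty_cycles(dt_d, dt_u):
    """Return (T, dT_u, dT_d) for four clock elements.

    dt_d, dt_u: the four (delta t_d) and (delta t_u) values, in any order.
    """
    d1, _, d3, d4 = sorted(dt_d)
    ua, _, uc, ud = sorted(dt_u)
    period = d3 + uc                          # eq. (3.72)
    # Negative overlaps are clamped to zero (an assumption of this sketch).
    dT_u = max(0.0, d1 - ud + uc) / period    # eq. (3.73)
    dT_d = max(0.0, ua - d4 + d3) / period    # eq. (3.74)
    return period, dT_u, dT_d


def worst_case_overlap(x):
    """Worst-case overlap (1 - 3x)/2 for fractional tolerance x, floored at 0."""
    return max(0.0, (1 - 3 * x) / 2)


# Hypothetical delays in units of (delta t)_nom with x = 0.1, chosen so
# that dT_u reaches its worst case.
x = 0.1
period, dT_u, dT_d = overlap_duty_cycles(
    dt_d=[1 - x, 1.0, 1 + x, 1 + x],
    dt_u=[1 - x, 1 - x, 1 - x, 1 + x],
)
print(period, dT_u, dT_d, worst_case_overlap(x))
```

The sample values make $(\delta t_d)_1$ and $(\delta t_u)_c$ as small
as the tolerance allows and $(\delta t_d)_3$ and $(\delta t_u)_d$ as
large as it allows, which is the combination that minimizes
$\Delta T_u$.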

By nature of the design of the clock, any single failure
can directly affect the output of only one clock element. In
order to examine the post-failure operation of the circuit it is
not necessary to examine the failure-modes of every logic gate;
rather it is sufficient to examine the effect on clock operation
of one failed clock element. The operation of the clock is
examined below for three modes of failure of any one clock
element: failed to logic-level-1, failed to logic-level-0, and
random oscillation. Flaws in the design of the clock will be
exposed, and a revised design will be recommended.

The following definitions will be useful in the analysis:

$Q_n^3 = 1$ if and only if at least $n$ out of the three "good"
clock elements are at logic-level-1.

Using previously defined nomenclature, assign numbers and
letters to the three "good" clock elements such that:

$$(\delta t_d)_1 < (\delta t_d)_2 < (\delta t_d)_3$$
$$(\delta t_u)_a < (\delta t_u)_b < (\delta t_u)_c$$

Denote $(\delta t_d)$ and $(\delta t_u)$ of the failed clock
element, prior to failure, as $(\delta t_d)_f$ and
$(\delta t_u)_f$, respectively.

If the failure is to logic-level-1 and if the system does
not fail as a result of the occurrence of the failure, then the
event $Q_3^4 \to 1$ is equivalent to $Q_2^3 \to 1$, the event
$Q_2^4 \to 0$ is equivalent to $Q_1^3 \to 0$, and the principle
of operation of the three good clocks remains the same: once the
event $Q_2^3 \to 1$ has occurred, $Q_1^3 \to 0$ will occur
$(\delta t_d)_3$ later; as a result of the event
$Q_1^3 \to 0$, $Q_2^3 \to 1$ will occur $(\delta t_u)_b$ later.
Therefore the period of oscillation of each good clock is:

$$T_{f1} = (\delta t_d)_3 + (\delta t_u)_b \qquad (3.76)$$

where the subscript f1 denotes failure to logic-level-1. Unless
$(\delta t_d)_f > (\delta t_d)_3$ and
$(\delta t_u)_f < (\delta t_u)_b$, the period prior to failure
differs from $T_{f1}$.

If the failure is to logic-level-0 and if the system does
not fail as a result of the occurrence of the failure, then the
event $Q_3^4 \to 1$ is equivalent to $Q_3^3 \to 1$, the event
$Q_2^4 \to 0$ is equivalent to $Q_2^3 \to 0$, and: once
$Q_3^3 \to 1$ has occurred, $Q_2^3 \to 0$ will occur
$(\delta t_d)_2$ later; as a result of the event
$Q_2^3 \to 0$, $Q_3^3 \to 1$ will occur $(\delta t_u)_c$ later.
Therefore the period of oscillation of each good clock is:

$$T_{f0} = (\delta t_d)_2 + (\delta t_u)_c \qquad (3.77)$$

where the subscript f0 denotes failure to logic-level-0. Unless
$(\delta t_d)_f < (\delta t_d)_2$ and
$(\delta t_u)_f > (\delta t_u)_c$, the period prior to failure
differs from $T_{f0}$.

Prior to failure, the period $T$ is as follows:

If $(\delta t_d)_f > (\delta t_d)_3$ and $(\delta t_u)_f > (\delta t_u)_c$:
    $T = (\delta t_d)_3 + (\delta t_u)_c$
If $(\delta t_d)_f > (\delta t_d)_3$ and $(\delta t_u)_b < (\delta t_u)_f < (\delta t_u)_c$:
    $T = (\delta t_d)_3 + (\delta t_u)_f$
If $(\delta t_d)_f > (\delta t_d)_3$ and $(\delta t_u)_f < (\delta t_u)_b$:
    $T = T_{f1} = (\delta t_d)_3 + (\delta t_u)_b$
If $(\delta t_d)_2 < (\delta t_d)_f < (\delta t_d)_3$ and $(\delta t_u)_f > (\delta t_u)_c$:
    $T = (\delta t_d)_f + (\delta t_u)_c$
If $(\delta t_d)_2 < (\delta t_d)_f < (\delta t_d)_3$ and $(\delta t_u)_b < (\delta t_u)_f < (\delta t_u)_c$:
    $T = (\delta t_d)_f + (\delta t_u)_f$
If $(\delta t_d)_2 < (\delta t_d)_f < (\delta t_d)_3$ and $(\delta t_u)_f < (\delta t_u)_b$:
    $T = (\delta t_d)_f + (\delta t_u)_b$
If $(\delta t_d)_f < (\delta t_d)_2$ and $(\delta t_u)_f > (\delta t_u)_c$:
    $T = T_{f0} = (\delta t_d)_2 + (\delta t_u)_c$
If $(\delta t_d)_f < (\delta t_d)_2$ and $(\delta t_u)_b < (\delta t_u)_f < (\delta t_u)_c$:
    $T = (\delta t_d)_2 + (\delta t_u)_f$
If $(\delta t_d)_f < (\delta t_d)_2$ and $(\delta t_u)_f < (\delta t_u)_b$:
    $T = (\delta t_d)_2 + (\delta t_u)_b$

It has been shown that if the failure is to logic-level-1
or logic-level-0, and if the system does not fail as a result of
the occurrence of the failure, then the three good clock elements
continue operation, but at a frequency determined by the new set
of $(\delta t_d)$'s and $(\delta t_u)$'s of the three operating
elements.
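
The period relations before and after a stuck-at failure can be
summarized in a small sketch. The function names and the sample delay
values are hypothetical; the rules are those of eqns (3.72), (3.76),
and (3.77): before failure the third-smallest of the four values of
each kind sets the period, and afterwards the ranks among the three
good elements shift according to the failure mode.

```python
def period_before_failure(dt_d, dt_u):
    """T = third-smallest (delta t_d) + third-smallest (delta t_u), four clocks."""
    return sorted(dt_d)[2] + sorted(dt_u)[2]


def period_after_failure(good_d, good_u, stuck_at):
    """Period of the three good elements after one element sticks at 0 or 1."""
    d = sorted(good_d)
    u = sorted(good_u)
    if stuck_at == 1:
        return d[2] + u[1]  # eq. (3.76): (dt_d)_3 + (dt_u)_b
    if stuck_at == 0:
        return d[1] + u[2]  # eq. (3.77): (dt_d)_2 + (dt_u)_c
    raise ValueError("stuck_at must be 0 or 1")


# Hypothetical delays in arbitrary time units.
good_d, good_u = [10, 12, 15], [9, 11, 15]
failed_d, failed_u = 16, 8  # the element that is about to fail

print(period_before_failure(good_d + [failed_d], good_u + [failed_u]))
print(period_after_failure(good_d, good_u, stuck_at=1))
print(period_after_failure(good_d, good_u, stuck_at=0))
```

In this example the failed element has the largest $(\delta t_d)$ and
the smallest $(\delta t_u)$, so a failure to logic-1 leaves the period
unchanged while a failure to logic-0 changes it.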

Figures 3.12 through 3.15 show how the occurrence of a
failure may induce a spurious short-duration pulse in the
waveform of $Q_2^4$ or $Q_3^4$. In the remaining text of the
thesis such a short-duration pulse will be referred to as a
glitch. It is shown below that the clock will tolerate the
failure-modes illustrated in Figs. 3.12 and 3.13, providing
that $(\delta t_d)_i$ and $(\delta t_u)_i$ are within prescribed
tolerances, but that the failure-mode illustrated in Fig. 3.15
may induce a glitch to appear in any clock element output.
Because the undesirable transitions which occur in $Q_2^4$ or
$Q_3^4$ may be extremely rapid, slivering may occur within the
clock elements; the study which will be made in each case
represents the worst-case analysis.

Case 1. Again, in this case and those following, reference
is made to Fig. 3.8. In Fig. 3.12 $C_f$ fails to logic-1 within
the region $(\delta t_d)_2 < t_{f1} < (\delta t_d)_3$; the
analysis of this failure-mode follows:

$Q_{3i}^4 \to 0$ at $t = (\delta t_d)_2 + 2\Delta$    (3.78)
$Q_{3i}^4 \to 1$ at $t_{f1} + 2\Delta$    (3.79)
$Q_{3i}^4 \to 0$ at $(\delta t_d)_3 + 2\Delta$    (3.80)
$H_{2i} \to 1$ at $(\delta t_d)_2 + 3\Delta$    (3.81)
$H_{2i} \to 0$ at $t_{f1} + 3\Delta$    (3.82)
$H_{2i} \to 1$ at $(\delta t_d)_3 + 3\Delta$    (3.83)
$A_{3i} \to 1$ at $(\delta t_d)_2 + 4\Delta$    (3.84)
$A_{3i} \to 0$ at $t_{f1} + 4\Delta$    (3.85)
$A_{3i} \to 1$ at $(\delta t_d)_3 + 4\Delta$    (3.86)
$D_{2i} \to 0$ at $(\delta t_d)_2 + 2\Delta + \Delta t$    (3.87)
$D_{2i} \to 1$ at $t_{f1} + 2\Delta + \Delta t$    (3.88)
$D_{2i} \to 0$ at $(\delta t_d)_3 + 2\Delta + \Delta t$    (3.89)
$A_{4i} \to 0$ at $(\delta t_d)_2 + 3\Delta + \Delta t$    (3.90)
$A_{4i} \to 1$ at $t_{f1} + 3\Delta + \Delta t$    (3.91)
$A_{4i} \to 0$ at $(\delta t_d)_3 + 3\Delta + \Delta t$    (3.92)


If transition (3.90) occurs while $A_{3i}$ is at logic-0 (see
transitions (3.84) through (3.86)) then a gate output [symbol
illegible in the source] is driven to logic-0 at about the same
time that it is being driven to logic-1 by the propagation of a
quorum-function transition [symbol illegible in the source]. In
order to avoid this situation, set:

$$[(\delta t_d)_2 + 3\Delta + \Delta t] - [(\delta t_d)_3 + 4\Delta] > 0$$

or,

$$(\delta t_d)_3 - (\delta t_d)_2 < \Delta t - \Delta \qquad (3.93)$$

As previously stated, we may define the tolerances of
$(\delta t)$ as follows:

$$(\delta t)_{nom}(1 - x) < (\delta t) < (\delta t)_{nom}(1 + x) \qquad (3.94)$$

Combining eqns. (3.93) and (3.94) in a worst-case condition:

$$(\delta t)_{nom}(1 + x) - (\delta t)_{nom}(1 - x) < \Delta t - \Delta$$

or

$$x < \frac{\Delta t - \Delta}{2(\delta t)_{nom}} \qquad (3.95)$$

and since $(\delta t)_{nom} = 6\Delta + \Delta t$,

$$x < \frac{\Delta t - \Delta}{2\Delta t + 12\Delta} \qquad (3.96)$$

If the system is to be run at maximum frequency,
$\Delta t = 8\Delta$, and $x < \frac{1}{4}$. It is seen that
there is a tradeoff between the tolerance of components and the
frequency.

Case 2. In Fig. 3.13 $C_f$ fails to logic-0 within the
region $(\delta t_u)_b < t_{f0} < (\delta t_u)_c$; an analysis
similar to that of Case 1 yields:

$$(\delta t_u)_c - (\delta t_u)_b < \Delta t + \Delta \qquad (3.97)$$

Here the requirement on $x$ for successful operation is:

$$x < \frac{\Delta t + \Delta}{2\Delta t + 12\Delta} \qquad (3.98)$$

This is not as stringent a requirement as that of Case 1.
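
The two tolerance bounds can be compared numerically. The sketch below
evaluates eqns (3.96) and (3.98) as functions of the ratio
$\Delta t / \Delta$; the function names are illustrative, and exact
fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction


def tolerance_bound_case1(ratio):
    """Upper bound on x from eq. (3.96); ratio = Delta_t / Delta."""
    r = Fraction(ratio)
    return (r - 1) / (2 * r + 12)


def tolerance_bound_case2(ratio):
    """Upper bound on x from eq. (3.98); ratio = Delta_t / Delta."""
    r = Fraction(ratio)
    return (r + 1) / (2 * r + 12)


# At maximum frequency the text takes Delta_t = 8 * Delta.
print(tolerance_bound_case1(8), tolerance_bound_case2(8))

# A larger Delta_t (lower frequency) relaxes the Case 1 bound.
for ratio in (8, 16, 32):
    print(ratio, float(tolerance_bound_case1(ratio)))
```

At $\Delta t = 8\Delta$ the Case 1 bound is $7/28 = 1/4$ and the Case 2
bound is $9/28$, which illustrates why Case 1 is the more stringent
requirement and how lowering the frequency loosens it.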

Case 3. In Fig. 3.14 $C_f$ fails to logic-0 within the
region $(\delta t_u)_c < t_{f0} < (\delta t_u)_d$. It can be
shown (in a manner similar to Case 4) that a gate output
[symbol illegible in the source] may go to logic-0 and back to
logic-1 within a time on the order of one propagation delay,
$\Delta$, this transition occurring shortly before $C_i$ is
driven to logic-0 by the event $Q_3^4 \to 1$ at
$t = (\delta t_u)$ [subscript illegible in the source]; as a
result a glitch may occur at the output of $N_{3i}$, but not
within $C_i$. In Case 4, however, it is shown that a glitch may
occur in $C_i$.

Case 4. In Fig. 3.15 $C_f$ fails to logic-1 within the
region $(\delta t_d)_3 < t_{f1} < (\delta t_d)_4$. The analysis
of this failure-mode follows:

As a result of $Q_2^4 \to 0$ at $(\delta t_d)_3$, $C_i$ is
driven to logic-1. If, when $Q_2^4 \to 1$ at $t_{f1}$, the gate
$M_i$ [subscript illegible in the source] is at logic-1,
$A_{6i}$ may go to 1, driving $M_{3i}$ to logic-0. The length of
time that $M_{3i}$ is at logic-0 is
$\le (\delta t_d)_4 - t_{f1}$; as a result $C_i$ may go to
logic-1 at about $3\Delta$ after it had gone to logic-0, or a
pulse of duration approximately equal to $\Delta$ may occur in
$C_i$, or $C_i$ may be unaffected, yielding a loss of
synchronism amongst the clock elements. Also the extra pulse in
$Q_2^4$ may cause a pulse to occur in $C_i$ just before $C_i$ is
set at logic-1.

Given that $C_f$ fails to logic-level-1, the probability that
system operation is affected as described in Case 4 is the
probability that $t_{f1}$ will occur between $(\delta t_d)_3$
and $(\delta t_d)_4$, or:

$$P_{sf} = \frac{(\delta t_d)_4 - (\delta t_d)_3}{(\delta t_d)_3 + (\delta t_u)_c} \qquad (3.99)$$

where the subscript sf stands for system failure. So the smaller
the tolerance on $(\delta t_d)$, the smaller is the probability
of a system failure being induced by the failure of a clock
element to logic-1. It should be noted, however, that if a clock
element exhibits a failure-mode of random oscillation, $P_{sf}$,
over an extended period of operation, approaches unity.
Therefore in order to assure single-fault-tolerance, the circuit
of Fig. 3.8 must be modified. Figure 3.16 is the suggested
redesign of a clock element.

Fig. 3.16 (caption truncated in the source)

The additional circuitry increases the values of
$(\delta t_d)_{nom}$ and $(\delta t_u)_{nom}$ as follows:

$$(\delta t_u)_{nom} = 9\Delta + \Delta t \qquad (3.100)$$
$$(\delta t_d)_{nom} = 7\Delta + \Delta t \qquad (3.101)$$

thereby decreasing the duty-cycle slightly. The values of the
necessary one-shot pulse widths are dependent on the desired
operating frequency of the system (and therefore are dependent
on the value of $\Delta t$). Figures 3.17 through 3.20
illustrate how the additional circuitry filters out the
extraneous pulses of $Q_{2i}^4$ or $Q_{3i}^4$; the pulse-width
of OS$_{1i}$ is
$[(\delta t_u)_{nom} + \frac{1}{2}(\delta t_d)_{nom}]$, and the
pulse-width of OS$_{2i}$ is
$[(\delta t_d)_{nom} + \frac{1}{2}(\delta t_u)_{nom}]$.
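
A brief sketch of the revised-element timing follows. It assumes that
the clock output is high for $(\delta t_d)_{nom}$ and low for
$(\delta t_u)_{nom}$ when all elements are identical, and it uses the
one-shot widths as read from the text above; the function name, the
dictionary keys, and the choice $\Delta t = 8\Delta$ in the example
are assumptions made for illustration.

```python
def revised_element_timing(delta, delta_t):
    """Nominal timing of the revised clock element, eqs. (3.100) and (3.101)."""
    dt_u = 9 * delta + delta_t  # eq. (3.100)
    dt_d = 7 * delta + delta_t  # eq. (3.101)
    period = dt_u + dt_d
    return {
        "period": period,
        "duty_cycle": dt_d / period,    # output assumed high for dt_d
        "os1_width": dt_u + dt_d / 2,   # one-shot OS_1i
        "os2_width": dt_d + dt_u / 2,   # one-shot OS_2i
    }


# Example in units of Delta, with Delta_t = 8 * Delta (assumed here).
timing = revised_element_timing(delta=1.0, delta_t=8.0)
for name, value in timing.items():
    print(name, value)
```

With these inputs the duty-cycle comes out at $15/32$, slightly below
the 50% of the original element, in line with the remark that the
additional circuitry decreases the duty-cycle slightly.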

As stated earlier, all clock lines are distributed to each
synchronous module in the computing system. If the module
transducer were simply a two-out-of-three voter, the occurrence
of a failure could cause a glitch in the output of the
transducer. The subsequent slivering which may occur could
destroy the synchronism of the system. Therefore the module
transducer must filter out the extraneous pulses. Figure 3.21 is
the design of a module transducer which may be used in
conjunction with the Revised Daly-McKenna Clock. The pulse width
of the one-shot is
$[(\delta t)_{nom} + \frac{1}{2}(\delta t)_{nom}]$ (the $u$ and
$d$ subscripts are illegible in the source). Figure 3.22
illustrates the operation of the module transducer. Note that
although the design of the single-fault-tolerant clock requires
four clock elements, only three need to be examined to extract a
reliable clock. For the sake of uniformity of presentation it
has been assumed that all clock lines are distributed. In the
event of the development of a reconfigurable voter (2/3 voting
with replacement), the implementation of such in place of the
simple majority voter in each module transducer could increase
the reliability of the clocking system.

Fig. 3.21 Module Transducer for use with
the Revised Daly-McKenna Clock

Fig. 3.22 Module Transducer Operation

For n-fault-tolerance, the number of required clock
elements is 3n + 1, as demonstrated by Daly and McKenna. Within
the module transducer, the voter increases in complexity but the
remainder of the circuitry remains the same. On the clock system
level the Daly-McKenna Clock is more costly than the method of
clocking examined in Section 3.3, but on the module level it is
much less costly.

3.4.2 Current Concept 

In October, 1971, McKenna designed a single-fault-tolerant
clock for use in the prototype fault-tolerant multiprocessor
(CERBERUS) currently being built at the C.S. Draper Laboratory.
The design is shown in Fig. 3.23. The clock has been built,
with medium speed TTL technology, and has been demonstrated to
survive the imposition of single faults. The frequency of the
clock now in existence is about 0.7 MHz.

The operating principles of this clock are similar to
those of the first concept. It should be noted that here, each
clock element will oscillate at its own chosen frequency if
separated from the others. When the system is interconnected,
however, each element will conform to one common frequency. As
will be demonstrated, the system behaves as follows: $C_i$,
after a short delay, is set to logic-1 by the occurrence of
$Q_2^4 \to 1$, or after a much longer delay, by $Q_2^4 \to 0$;
$C_i$ is reset to logic-0, after a short delay, by
$Q_3^4 \to 0$, or after a much longer delay, by $Q_3^4 \to 1$.

The clock is self-starting; remember that the first
concept (Fig. 3.8) requires additional logic to induce
free-running oscillation. Some explanation of the clock element
structure is called for. The pair of parallel inverters at the
Q output of the J-K flip-flop are there to increase the fan-out
capability; similarly the eight inverters at the input of each
clock element serve to overcome fan-out limitations. As can be
seen, $(Q_2^4)_i$ and $(Q_3^4)_i$ are each implemented through
four levels of gating. The strings of inverters, to which
$(Q_2^4)_i$ and $(Q_3^4)_i$ are applied, serve to time certain
key signals, and the peculiar configuration of the region around
the retriggerable one-shot (see Fig. 3.24) is the internal
configuration of the circuit element used.

As in the first concept, all clock lines are distri- 
buted to each synchronous module within the system. Simple 
two-out-of-three majority voting within the module is not 
sufficient, but, as before, the module transducer illustrated 
in Fig. 3.21 may be used. 

For purposes of analysis of the circuit, the following
simplifying assumptions are made:

(1) an element's propagation delay is the same for a
    logic-level-1-to-0 transition as for a 0-to-1
    transition,

(2) the delay associated with any inverter or NAND
    gate is equal to $\Delta$,

(3) the delay associated with a transition caused by
    clocking the J-K flip-flop is $2\Delta$,

(4) the delay associated with a transition caused by
    an applied zero to the SET or RESET input of the
    J-K flip-flop is $3\Delta$,

(5) the one-shot triggers when its input AND gate
    experiences a transition from logic-level-0 to 1;
    the delay between the input of the AND gate and
    the output of the one-shot is $2\Delta$; after the
    last triggering input to the AND gate, the one-shot
    will go to logic-level-0 at a time $\Delta t$ later.

Fig. 3.24 Retriggerable One-Shot used in
McKenna Clock

For now, assume that the clock elements are oscillating
in synchronism; the clock's self-starting capability may be
examined separately. As in the manner of the analysis of the
first concept, assume that at time $t = t_A$, $C_A \to 1$;
$t = t_B$, $C_B \to 1$; $t = t_C$, $C_C \to 1$; $t = t_D$,
$C_D \to 1$, where $t_D > t_C > t_B > t_A$. Assume that at an
initial time of observation, $(t_A - \epsilon)$, all
propagation within each clock element caused by the previous
transition, $C_i \to 0$, has ceased; then:

Initially:

$C_i = 0$;  $(Q_2^4)_i = 0$;  $(Q_3^4)_i = 0$
$S_i = 1$;  $R_i = 1$
$TR_i = 1$;  $TS_i = 1$

ROS$_i$ is indeterminable from a static analysis.

Given the initial state and the assumed progression of 
events , the timing analysis is as follows: 


C_-> 1 att = t (3.102) 

A A 

C_-> 1 at t (3.103) 

B B 

(Q*) L ~> 1 at t B + 4A (3.104) 

?.-» 0 at t + 5A (3.105) 

X B 

S.- 1 at t + 8A (3.106) 

X B 

G 1 at t (3.107) 

V-» C 

where, t fi < t c < t 0 + 9A 

C D ~* 1 at t D (3.108) 

where, t Q < < t fi + 9A 

(Q*).^ 1 at t c + 4A (3.109) 

TS j-* 0 at t + 5A (3.110) 

TsL-* 1 at t + 8A (3.111) 


ROS . is triggered either at t * t + 6A or within 

X v-* 

t c + 3A < t < t c + 6A by its own expiration (ROS^-» 0) ; therefore 

ROS . , i = A,B,C, is triggered at t = t + 6A; ROS is triggered 
... x c lj 

within t c + 3A < t < t c + 6A 

ROSp-* 0 within t + 3A + At < t < t^, + 6A + At (3.112) 

R0S_ D 0 at t + 6A + At (3.113) 

A |B|C C 
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C_-» 0 within t + 6A + At < t < t + 9A + At 
D C C (3.114) 

C „ 0 at t + 9A + At (3.115) 

A,B f C C 


ros d 

and ROS^, i 


is triggered within t + 4A + At < t < t 

v v 

= A,B,C, is triggered at t = t + 7A + At 

v 


«S>1* 0 

at 

fc c 

+ 

13 A + 

At 

(Q 3> 1"* 0 

at 


4 

13A + 

At 

TR. -* 0 

X 

at 

fc c 

4 

13 A 

4 

At 

TR ± -* 1 

at 

fc c 

4 

16A 

4 

At 

R. -> 0 

X 

at 


4 

3.5A 

4 

At 

R. -» 1 

X 

at 

fc c 

4 

18A 

4 

At 


+ 7 A + At, 

(3.116) 

(3.117) 

(3.118) 

(3.119) 

(3.120) 

(3.121) 


$ROS_i$ is triggered at $t_C + 14\Delta + \Delta t$.

    $ROS_i \to 0$ at $t_C + 14\Delta + 2\Delta t$         (3.122)

    $C_i \to 1$ at $t_0 = t_C + 17\Delta + 2\Delta t$     (3.123)

    $(Q_2^4)_i \to 1$ at $t_0 + 4\Delta$                  (3.124)

    $(Q_3^4)_i \to 1$ at $t_0 + 4\Delta$                  (3.125)

    $TS_i \to 0$ at $t_0 + 5\Delta$                       (3.126)

    $TS_i \to 1$ at $t_0 + 8\Delta$                       (3.127)

$ROS_i$ is triggered at $t_0 + 6\Delta$.

    $ROS_i \to 0$ at $t_0 + 6\Delta + \Delta t$           (3.128)

    $C_i \to 0$ at $t_0 + 9\Delta + \Delta t$             (3.129)

    $(Q_2^4)_i \to 0$ at $t_0 + 13\Delta + \Delta t$      (3.130)

    $(Q_3^4)_i \to 0$ at $t_0 + 13\Delta + \Delta t$      (3.131)
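The steady-state period implied by this timing analysis can be
tallied mechanically. The sketch below is only a bookkeeping aid,
not part of the original analysis: each event time is held as a
pair (a, b) standing for $a\Delta + b\Delta t$, and the helper
names are hypothetical. The numerical values $\Delta = 12$ ns and
$\Delta t = 240$ ns are assumptions chosen for illustration.

```python
# Event times relative to a rising edge, stored as (a, b),
# meaning a*Delta + b*Delta_t. Values are read off the equations.
FALL_AFTER_RISE = (9, 1)            # C_i -> 0 at t_0 + 9*Delta + Delta_t
NEXT_RISE_AFTER_RISE = (17, 2)      # t_0 = t_C + 17*Delta + 2*Delta_t


def subtract(x, y):
    """Component-wise difference of two (a, b) time expressions."""
    return (x[0] - y[0], x[1] - y[1])


def evaluate(coeffs, delta_ns, dt_ns):
    """Numerical value, in ns, of a*Delta + b*Delta_t."""
    a, b = coeffs
    return a * delta_ns + b * dt_ns


HIGH_TIME = FALL_AFTER_RISE                                  # (9, 1)
LOW_TIME = subtract(NEXT_RISE_AFTER_RISE, FALL_AFTER_RISE)   # (8, 1)
PERIOD = NEXT_RISE_AFTER_RISE                                # (17, 2)

# Illustrative (assumed) values: Delta = 12 ns, Delta_t = 20*Delta.
period_ns = evaluate(PERIOD, 12.0, 240.0)
```

Under perfect synchronism the clock is therefore high for
$9\Delta + \Delta t$ and low for $8\Delta + \Delta t$ in each cycle.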

Before proceeding any further with the analysis of the
current concept of the McKenna clock, a major fault in the
design should be pointed out: $Q_2^4 \to 1$ forces $C_i \to 1$
(if this transition has not already occurred) at a time
$9\Delta$ later and $Q_3^4 \to 0$ forces $C_i \to 0$ at a time
$10\Delta$ later, but in the clock element in which a transition
of $C_i$ occurs as a result of $Q_2^4 \to 1$ or $Q_3^4 \to 0$,
the retriggerable one-shot expires, thereby clocking the
flip-flop and forcing an unintended change of state; the
occurrence of these glitches in two clock outputs could induce a
loss of synchronization within the system. A possibility for
correcting the design is: remove TR and TS, retrigger the
one-shot with S or R, and insert delays between S and the SET
input of the flip-flop and between R and the RESET input of the
flip-flop, such that the one-shot cannot expire in the half-cycle
in which $C_i$ has been changed by S or R. To serve such a
purpose, delay times of $6\Delta$ are sufficient.

Here, as in the first concept, the occurrence of a
failure may introduce a glitch in $Q_2^4$ or $Q_3^4$, subsequent
slivering within the clock elements, and possible loss of
synchronization. The same modification is recommended here as
was introduced for the first concept. Figure 3.25 illustrates
the author's revised design of the McKenna clock.

Assume that the rise-time of the one-shots used for
filtering $Q_2^4$ and $Q_3^4$ is $\Delta$. Then when
$Q_3^4 \to 1$, $C_i \to 0$, $11\Delta + \Delta t$ later; when
$Q_2^4 \to 0$, $C_i \to 1$, $10\Delta + \Delta t$ later. The
period of the clock is:

    $T = 21\Delta + 2\Delta t$

The expiration of the one-shot triggers the one-shot, $\Delta$
later; $\Delta t$ must be such that the one-shot will not expire
again before it is retriggered by $S_i$ or $R_i$. From time of
expiration to retriggering by $Q_3^4 \to 1$ or $Q_2^4 \to 0$
(assuming perfect synchronism) the time which lapses is
$10\Delta$ or $11\Delta$. In order to allow for the results of
differences in propagation delay amongst clock elements (e.g.,
internal delay, phase differences amongst the $C_i$), $\Delta t$
should be greater than or equal to $20\Delta$. Therefore:

    $f_{max} = \frac{1}{61\Delta}$

and if medium-speed TTL technology is used ($\Delta \approx 12$ ns),

    $f_{max} \approx 1.4$ MHz.
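The arithmetic behind this bound can be checked with a short
sketch. The function names are illustrative only; the inputs are
the period $T = 21\Delta + 2\Delta t$, the constraint
$\Delta t \ge 20\Delta$ taken at equality, and the assumed gate
delay of 12 ns.

```python
def mckenna_period_ns(delta_ns, dt_in_deltas=20.0):
    """Period T = 21*Delta + 2*Delta_t in ns, with Delta_t given in gate delays."""
    dt_ns = dt_in_deltas * delta_ns
    return 21.0 * delta_ns + 2.0 * dt_ns


def max_frequency_mhz(delta_ns, dt_in_deltas=20.0):
    """Clock frequency in MHz for the period above (1000 / T[ns])."""
    return 1000.0 / mckenna_period_ns(delta_ns, dt_in_deltas)


ttl_period_ns = mckenna_period_ns(12.0)   # 61 gate delays of 12 ns each
ttl_f_max_mhz = max_frequency_mhz(12.0)   # roughly 1.37 MHz
```

Choosing a larger $\Delta t$ than the minimum lengthens the period
and lowers the frequency accordingly.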

Retriggerable one-shots with timing adjustments are
available; if these are used they may be adjusted after the
clock is wired, thereby allowing the synchronous quality to be
"peaked". Indeed the use of adjustable retriggerable one-shots
obviates concern with $\Delta t_u$, $\Delta t_a$ (defined in
Sec. 3.4.1), or frequency variation caused by a single failure.
After the clock system has been peaked, deterioration of the
synchronous quality is restricted to that introduced by
component aging or environmental factors (such as temperature
variations).

The retriggerable one-shot is responsible for the
self-starting property of the clock. At power-on the flip-flops
come up in an arbitrary state. If $ROS_i$ comes up logic-1, then
it has just been triggered and will expire $\Delta t$ later; if
it comes up logic-0, it triggers itself and will expire
$\Delta t$ later. Within short order $Q_2^4 \to 0$ or
$Q_3^4 \to 1$ occurs and the clock elements begin oscillating in
synchronism.

Fault-tolerance is achieved in this design in the same
manner as described in Section 3.4.1; if the clock is to be
n-fault-tolerant, 3n+1 clock elements are required.

3.5 Speeding Up the McKenna Clock 

3.5.1 Advantages of Greater Speed 

The speed of a synchronous processor is directly related 
to the clock frequency. Current technology allows the design 
of a synchronous processor which runs at 20 MHz or more. The 
maximum frequency obtainable by the clock shown in Fig. 3.25 
is, with TTL technology, 1.4 MHz; to run a processor at this
rate can be highly inefficient. It is, therefore, desirable to 
have the capability of operating a fault-tolerant clock at a 
substantially greater speed. 

3.5.2 Application of Advanced Device Technology 

The numerical values of clock frequencies derived in this
thesis assume the use of medium-speed TTL technology. But
there is, today, higher speed technology available. In 1968,
Motorola began marketing a logic family which offers 1 ns gate
propagation delays (MECL III). The McKenna-type clock shown in
Fig. 3.25, if implemented in MECL III, could operate at 16.4 MHz.

Certainly, with the development of higher speed
technologies, proportionately higher speed fault-tolerant clocks
may be built; it should be realized, however, that the
availability of higher speed technologies will also allow the
design of higher speed processors. As such it is desirable
to design a fault-tolerant clock which can run at 20 MHz
utilizing medium-speed TTL technology; then the implementation
of both processor and clock with any higher speed technology
will yield a proportionately higher speed computing system.

3.5.3 Revised Circuit 

Simply by connecting $TR_i$ to the SET input of the
flip-flop and $TS_i$ to the RESET input of the flip-flop, the
frequency is tripled. The revised clock element is shown in
Fig. 3.26. The basic operating principles are: $Q_2^4 \to 0$
induces $C_i \to 1$ (after a delay, $9\Delta$) and $Q_3^4 \to 1$
induces $C_i \to 0$ (after a delay, $10\Delta$). The period is
given by:

    $T = 19\Delta$

For medium-speed TTL technology, $f \approx 4.4$ MHz. The
retriggerable one-shot remains high during synchronous
operation; it is needed here only for initialization of
oscillation. In order to assure a start, only three of the four
clock elements of a single-fault-tolerant clock need be provided
with a retriggerable one-shot and its associated circuitry.
Consider such a design where for $ROS_A$, $\Delta t = 20\Delta$;
for $ROS_B$, $\Delta t = 35\Delta$; and for $ROS_C$,
$\Delta t = 55\Delta$. At power-on the flip-flops come up in an
arbitrary state and the retriggerable one-shots are triggered.
Fig. 3.27 shows how the clock is started from each of the sixteen
possible initial conditions. If one of the retriggerable
one-shots fails, a start is still assured.

The disadvantages of this design stem from the
circumstance that the frequency is dependent solely on gate
delays. Because there are no devices having adjustable time
delays within the clock elements, it is not simple to design the
clock for a particular frequency; neither is it possible
to minimize the phase differences between clock elements through
the method of peaking described in Section 3.4.2.

Fig. 3.27 Analysis of self-starting operation of revised circuit

[Figure: one start-up sequence for each of the sixteen possible
initial conditions. In the order listed in the figure,
synchronization has been achieved when: $C_i \to 1$ at $56\Delta$;
$C_i \to 0$ at $51\Delta$; $C_i \to 0$ at $51\Delta$;
$C_i \to 0$ at $36\Delta$; $C_i \to 1$ at $50\Delta$;
$C_i \to 0$ at $36\Delta$; $C_i \to 0$ at $36\Delta$;
$C_i \to 1$ at $70\Delta$; $C_i \to 0$ at $71\Delta$;
$C_i \to 1$ at $35\Delta$; $C_i \to 1$ at $35\Delta$;
$C_i \to 0$ at $51\Delta$; $C_i \to 1$ at $35\Delta$;
$C_i \to 1$ at $50\Delta$; $C_i \to 1$ at $50\Delta$;
$C_i \to 0$ at $57\Delta$.]

3.5.4 Increased Speed by Frequency Multiplication 

The frequency of the distributed clock can be increased
through use of the concept illustrated in Fig. 3.28 for
single-fault-tolerance. $C_A$, $C_B$, $C_C$, and $C_D$ are the
outputs of a single-fault-tolerant McKenna-type clock; these
"low frequency" clocks are supplied to triplicated
two-out-of-three voters, the outputs of which are each supplied
to a frequency multiplier ($\times N$). The outputs of the
multipliers, $C_1$, $C_2$, and $C_3$, are distributed to the
data management system.
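The masking action of the two-out-of-three voters can be
sketched as a majority function on three logic levels. Which
three of the four clock outputs feed each voter is fixed by
Fig. 3.28 and is not reproduced here; the wiring in
`triplicated_voters` below is purely an assumption made for
illustration, as are the function names.

```python
def vote_2_of_3(a, b, c):
    """Majority of three logic levels, each 0 or 1."""
    return (a & b) | (b & c) | (a & c)


def triplicated_voters(c_a, c_b, c_c, c_d):
    """Three independent voters, each fed a different triple of clocks.

    The choice of triples is hypothetical; any assignment in which each
    voter sees three distinct clocks masks a single faulty clock.
    """
    return (
        vote_2_of_3(c_a, c_b, c_c),
        vote_2_of_3(c_b, c_c, c_d),
        vote_2_of_3(c_a, c_c, c_d),
    )


# Three good clocks high, C_D stuck low: all voter outputs remain high.
outputs_stuck_low = triplicated_voters(1, 1, 1, 0)
# Three good clocks low, C_D stuck high: all voter outputs remain low.
outputs_stuck_high = triplicated_voters(0, 0, 0, 1)
```

Because every voter receives at most one faulty input, its output
follows the good clocks in both cases.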

Several methods of synthesizing the frequency
multiplication come to mind; the first is illustrated in
Fig. 3.29. The circuit of Fig. 3.29 may be described as a
burst generator: for each pulse of the input, the multiplier
produces a "burst" of n pulses. The limitations of this
circuit are determined by: frequency and frequency variation
of the input, minimum pulse width producible by the one-shot,
tolerance in the propagation delay of the one-shot, tolerance
in the pulse width, minimum delay-times available, tolerances in
delay-times, and the tolerance in the propagation delay of the
OR gate. The limitations imposed by the tolerances may be
minimized by careful component selection; nevertheless some
tolerance must be allowed: ±2% seems reasonable. Assume that
a one-shot is available with a pulse width of 25 ns and a rise
time of 10 ns. Assume that a tapped delay line is available
with total delay 500 ns and taps available at 50 ns intervals
(the Digital Equipment Corporation manufactures just such a
delay line. They claim a tolerance from the input to each
delay tap of ±5%; hence that tolerance will be allowed for the
delays considered here). There is a trade-off between the
quality of synchronization at the outputs of the three frequency
multipliers and the number of 50 ns delays used. Not including
delays introduced by the gates, corresponding pulses passing
through independent 500 ns delays may be separated, at the
output, by as much as 50 ns (if 25 ns pulses are considered,
the trailing-edge-to-leading-edge separation is 25 ns;
synchronism has been destroyed). If 25 ns pulses pass through
independent delays, and it is desired at the output to have a
minimum overlap of 15 ns, then the maximum difference between
delay-times is 10 ns and hence for an allowed ±5% tolerance,
the largest delay-time which may be used is 100 ns. Therefore,
it is concluded that unless precision delay lines are made
available, the frequency multiplier of Fig. 3.29 is restricted
to producing an integral multiple of 3 or less if it is to
produce a 20 MHz clock, synchronously with other similar
frequency multipliers. The limitation on the multiplying
capability of Fig. 3.29 requires that the input frequency be
6.7 MHz for the output frequency to be 20 MHz. As developed
in Section 3.5.3, the operating frequency of the revised, and
fastest, McKenna-type clock is 4.4 MHz for a medium-speed TTL
implementation, which falls short of this requirement.

Fig. 3.29 Burst Generator
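The tolerance argument above can be restated as a small
calculation. This is a sketch under stated assumptions, not a
model of Fig. 3.29 itself: it assumes the worst case of one delay
at +5% and the other at -5% of nominal, and it assumes that a
burst consists of the undelayed pulse plus one pulse per usable
tap, with taps spaced one output period (50 ns for 20 MHz) apart.
The function names are hypothetical.

```python
def max_delay_ns(pulse_width_ns, min_overlap_ns, tol_percent):
    """Largest nominal delay that still guarantees the required overlap.

    Two independent delays, each within +/- tol_percent of nominal,
    can differ by 2 * tol_percent of the nominal delay.
    """
    max_skew_ns = pulse_width_ns - min_overlap_ns
    return max_skew_ns * 100.0 / (2.0 * tol_percent)


def max_multiple(longest_delay_ns, output_period_ns):
    """Pulses per burst: the undelayed pulse plus one per usable tap."""
    return 1 + int(longest_delay_ns // output_period_ns)


longest = max_delay_ns(25.0, 15.0, 5.0)      # 10 ns of skew allowed
multiple = max_multiple(longest, 50.0)       # taps at 50 ns and 100 ns
required_input_mhz = 20.0 / multiple         # input needed for 20 MHz out
```

With a tighter delay tolerance the same functions show how quickly
the usable multiple grows, which is the sense in which precision
delay lines would relax the restriction.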


The difficulty with the multiplier of Fig. 3.29 suggests
the design shown in Fig. 3.30. The voting between multiplying
stages corrects for the phase differences amongst the outputs
of the multipliers of the preceding stage. $M_{1i}$, $M_{2i}$,
and $M_{3i}$ are illustrated in Figs. 3.31, 3.32, and 3.33
respectively. In Fig. 3.31, the one-shot pulse width is 100 ns;
in Fig. 3.32, 50 ns; in Fig. 3.33, 25 ns. Note that for a 20 MHz
output only a 1 MHz input is required; hence the requirements
placed on the design of the fault-tolerant clock may be lessened.

Fig. 3.30 Frequency Multiplication through use of
cascaded burst generators

Another method of synthesizing the frequency
multiplication is through use of a phase-locked loop, as
illustrated in Fig. 3.34. The drawback of a system which
utilizes phase-locked loops for frequency multiplication,
however, is that in the event of an input clock failure, and
hence a subsequent change in input frequency, independent
phase-locked loops may not maintain synchronism while their
outputs adjust to the new frequency.

It is concluded that for speeding up the McKenna Clock 
(or any other slow clock) the cascaded burst generator design 
offers the most viable solution. 

3.6 Methods of Synchronization Used in Pulse-Code 
Modulation 

The synchronization of pulse-code modulation (PCM) 
networks has long been a subject of interest. The question of 
synchronizing switching centers, in addition to transmission 
links, arose in 1956, when the PCM telephone switching experi- 
ment, later named Essex, was planned. 

Possible methods of synchronization as described by 
Mumford and Smith (Ref. 3), are as follows: 

(1) Homochronous system. One station in the network
has a master oscillator, and all the others are locked to it.

(2) Synchronous system. Each station is phase-locked
to the average of several signals.

(3) Quasi-synchronous or mesochronous system. Each
station has its own frequency control, but the bit rate can be
changed to one or two or three discrete rates, or the frame
rate may be changed by adding or dropping special bits, so as
to ensure that the average frequency of each station is the
same as any other, if a long period is considered. At any
instant there will be a phase error between an incoming signal
and the local signal, which must not be allowed to exceed a
specified maximum in operation.

(4) Heterochronous system. Each station generates its
own frequency within a specified tolerance of the nominal
frequency. The tolerance must be kept small enough to reduce
to negligible proportions the loss of information which occurs
when a fast signal arrives at a slower station.

Fig. 3.34 Frequency multiplication by use of a
phase-locked loop

The only one of the four systems named above which offers
potential application to the design of a fault-tolerant clock
is the synchronous system. Such a method of synchronization of
oscillators falls into the general method illustrated in
Fig. 3.2. For use in PCM, schemes have been developed which
consist of averaging the phases of all oscillators, comparing
the result with each oscillator phase, and applying an error
signal as a correction to the oscillator frequency. The
oscillators which are used have frequencies which may be altered
in proportion to a control signal; in the absence of external
control, each oscillator operates at a different frequency. A
system which has been examined by Gersho and Karafin (Ref. 4) is
illustrated in Fig. 3.35. It has been proven that under suitable
conditions the system is stable; i.e., the oscillators
asymptotically settle to a common frequency and the phase
differences have finite asymptotic values. The system
illustrated in Fig. 3.35 is offered by Gersho and Karafin as an
abstraction of the more complex practical systems that have been
developed; the system of Fig. 3.35 requires a total phase
comparison to be made, which is impractical to implement. In
the PCM system, failure of a transmission link will lead to
resynchronization if the remaining network is still connected.
Also, in the case of oscillator failure, the remaining N-1
oscillators will resynchronize to a new frequency, if the
resulting network of N-1 stations is still connected, after
removal of all transmission links entering or leaving the
inoperative station.

Fig. 3.35 Synchronization of PCM station
oscillators - station i
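The behaviour described above, settling to a common frequency
with finite residual phase differences, can be illustrated with
a deliberately simplified linear model. The sketch below is an
assumption-laden illustration and not the system of Fig. 3.35:
each oscillator's frequency is taken to be its free-running
value plus a gain times its phase error against the average of
all phases, integrated with a fixed Euler step. The function
name, gain, and frequencies are hypothetical and in arbitrary
units.

```python
def simulate(free_freqs, gain, step=1e-3, steps=20000):
    """Mutual synchronization by phase averaging (simplified linear model).

    Returns the final instantaneous frequencies and phases.
    """
    n = len(free_freqs)
    phases = [0.0] * n
    rates = list(free_freqs)
    for _ in range(steps):
        mean_phase = sum(phases) / n
        # Correction proportional to each oscillator's phase error
        # measured against the average phase of all oscillators.
        rates = [w + gain * (mean_phase - p)
                 for w, p in zip(free_freqs, phases)]
        phases = [p + r * step for p, r in zip(phases, rates)]
    return rates, phases


rates, phases = simulate([0.98, 1.00, 1.03], gain=5.0)
spread = max(rates) - min(rates)          # frequencies have converged
phase_offset = phases[2] - phases[0]      # settles near (1.03 - 0.98) / gain
```

In this model the common frequency is the mean of the free-running
frequencies, and the residual phase offsets shrink as the gain is
raised. The transient before convergence is the interval during
which synchronous operation is not assured.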

It is impractical to utilize the above system in the 
design of a fault-tolerant clock. Not only must provision be 
made to remove a failed oscillator from the system, but during 
the time when the system is adjusting to a new frequency, as a 
result of the failure of a link or oscillator, synchronous 
operation cannot be assured. 
CHAPTER 4


CONCLUSIONS 


In a multiprocessor which achieves fault-tolerance 
through replication, corresponding units must be kept in 
synchronism. In Chapter 2 the synchronization requirements 
have been defined and methods of maintenance of synchronism 
have been developed; the primary conclusion to be drawn from 
Chapter 2 is that a synchronous fault-tolerant multiprocessor 
driven by a fault-tolerant clock is more efficient and more 
easily implemented than is an asynchronous fault-tolerant 
multiprocessor. This conclusion leads naturally to the
examination of fault-tolerant clocking.

In Chapter 3 specifications have been developed for a
fault-tolerant clock and two general methods of design have
been explored. It has been concluded that fault-tolerant
clocking through failure-detection and subsequent clock
substitution is possible but impractical to implement in a
system of many synchronous modules due to the high cost of each
module transducer. Fault-tolerant clocking through the concepts
advanced by William Daly and John McKenna has been studied
intensively; clocks developed by Daly and McKenna have been
examined, refined, and revised. It has been concluded that it
is desirable to have available a fault-tolerant clock which
runs at 20 MHz, but that such a frequency is not achievable by
a McKenna-type clock (with use of current technology). A
method of achieving a 20 MHz clock by the use of a relatively
slow McKenna-type clock in conjunction with a frequency
multiplier has been developed in Section 3.5.4.